Abstract

Order of magnitude reasoning, that is, reasoning by rough comparisons of the sizes of
quantities, is often called "back of the envelope calculation", with the implication that the
calculations are quick though approximate. This paper exhibits an interesting class of
constraint sets in which order of magnitude reasoning is demonstrably fast. Specifically, we
present a polynomial-time algorithm that can solve a set of constraints of the form "Points
a and b are much closer together than points c and d." We prove that this algorithm can be
applied if "much closer together" is interpreted either as referring to an infinite difference in
scale or as referring to a finite difference in scale, as long as the difference in scale is greater
than the number of variables in the constraint set. We also prove that the first-order theory
over such constraints is decidable.

1. Introduction 

Order of magnitude reasoning — reasoning by rough comparisons of the sizes of quantities — 
is often called "back of the envelope calculation", with the implication that the calculations
are quick though approximate. Previous AI work on order of magnitude reasoning, however, 
has focussed on its expressive power and inferential structure, not on its computational 
leverage (Raiman, 1990; Mavrovouniotis and Stephanopoulos, 1990; Davis, 1990; Weld, 
1990). 

In this paper we exhibit an interesting case where solving a set of order of magnitude 
comparisons is demonstrably much faster than solving the analogous set of simple order 
comparisons. Specifically, given a set of constraints of the form "Points a and b are much
closer together than points c and d," the consistency of such a set can be determined in
low-order polynomial time. By contrast, it is easily shown that solving a set of constraints
of the form "The distance from a to b is less than or equal to the distance from c to d" in
one dimension is NP-complete, and in higher dimensions is as hard as solving an arbitrary 
set of algebraic constraints over the reals. 

In particular, the paper presents the following results: 

1. The algorithm "solve_constraints(S)" solves a system of constraints of the form "Points
a and b are infinitely closer than points c and d" in polynomial time (Section 5).

2. An improved version of the algorithm runs in time O(max(n^2 α(n), ne, s)), where n is
the number of variables, α(n) is the inverse Ackermann function, e is the number of
edges mentioned in the constraint set, and s is the size of the constraint set. (Section
6.1)

3. An extended version of the algorithm allows the inclusion of non-strict constraints of
the form "Points a and b are not infinitely further apart than points c and d." The
running time for this modified algorithm is slower than that of solve_constraints, but
still polynomial time. (Section 6.2)

4. A different extension of the algorithm allows the combination of order of magnitude 
constraints on distances with order comparisons on the points of the form "Point a 
precedes point b." (Section 6.3)

5. The same algorithm can be applied to constraints of the form "The distance from a 
to b is less than 1/B times the distance from c to d," where B is a given finite value,
as long as B is greater than the number of variables in the constraint set. (Section 7) 

6. The first-order theory over such constraints is decidable. (Section 8) 

As preliminary steps, we begin with a small example and an informal discussion (Section 
2). We then give a formal account of order-of-magnitude spaces (Section 3) and present
a data structure called a cluster tree, which expresses order-of-magnitude distance
comparisons (Section 4). We conclude the paper with a discussion of the significance of these
results (Section 9). 

2. Examples 

Consider the following inferences: 

Example 1: I wish to buy a house and rent office space in a suburb of Metropolis. For 
obvious reasons, I want the house to be close to the school, the house to be close to the 
office, and the office to be close to the commuter train station. I am told that in Elmville 
the train station is quite far from the school, but in Newton they are close together. 

Infer that I will not be able to satisfy my constraints in Elmville, but may be able to 
in Newton. 

Example 2: The Empire State Building is much closer to the Washington Monument than 
to Versailles. The Statue of Liberty is much closer both to the Empire State Building and 
to Carnegie Hall than to the Washington Monument. 

Infer that Carnegie Hall is much closer to the Empire State Building than to Versailles. 

Example 3: You have to carry out a collection of computational tasks covering a wide 
range of difficulty. For instance 

a. Add up a column of 100 numbers. 

b. Sort a list of 10,000 elements. 

c. Invert a 100 x 100 matrix. 

d. Invert a 1000 x 1000 matrix. 

e. Given the O.D.E. x' = cos(e^x), x(0) = 0, find x(20) to 32-bit accuracy.

f. Given an online collection of 1,000 photographs in GIF format, use state-of-the-art
image recognition software to select all those that show a man on
horseback.

g. Do a Web search until you have collected 100 pictures of men on horseback,
using state-of-the-art image recognition software.

h. Using state-of-the-art theorem proving software, find a proof that the medians
of a triangle are concurrent.

i. Using state-of-the-art theorem proving software, find a proof of Fermat's 
little theorem. 

It is plausible to suppose that, in many of these cases, you can say reliably that one 
task will take much longer than another, either by a human judgment or using an expert 
system. For instance, task (a) is much shorter than any of the others. Task (b) is much 
shorter than any of the others except (a) and possibly (h). Task (c) is certainly much 
shorter than (d), (f), (g), or (i). However, with certain pairs such as (c) and (h) or (c) and 
(e) it would be difficult to guess whether one is much shorter than another, or whether they 
are of comparable difficulty. 

You have a number of independent identical computers, of unknown vintage and char- 
acteristics, on which you will schedule tasks of these kinds. Note that, under these circum- 
stances, there is no way to predict the absolute time required by any of these tasks within a 
couple of orders of magnitude. Nonetheless, the comparative lengths presumably still stand. 

Given: a particular schedule of tasks on machines, infer what you can about the relative 
order of completion times. For example, given the following schedule 

Machine M1: tasks a,b,h,d.
Machine M2: tasks c,i. 

it should be possible to predict that (a) and (b) will complete before (c); that (c) will 
complete before (d); and that (d) will complete before (i); but it will not be possible to 
predict the order in which (c) and (h) will complete. 

In all three examples, the given information has the form "The distance between points 
W and X is much less than the distance between Y and Z". In examples 1 and 2, the 
points are geometric. In example 3, the points are the start and completion times of the 
various tasks, and the constraints on relative lengths can be put in the form "The distance 
from start(a) to end(a) is much less than the distance from start(c) to end(c)", and so on. 
In example 3, there is also ordering information: the start of each task precedes its end; the 
end of (a) is equal to the start of (b); and so on. The problem is to make inferences based 
on this weak kind of constraint. 

It should be noted that these examples are meant to be illustrative, rather than se- 
rious applications. Example 1 does not extend in any obvious way to a class of natural, 
large problems. Example 2 is implausible as a state of knowledge; how does the reasoner 
find himself knowing just the order-of-magnitude relations among distances and no other 
geometric information? Example 3 is contrived. Nonetheless, these illustrate the kinds of 
situations where order-of-magnitude relations on distance do arise; where they express a 
substantial part of the knowledge of the reasoner; and where inferences based purely on the 
order-of-magnitude comparisons can yield useful conclusions. 

The methods presented in this paper involve construing the relation "Distance D is much 
shorter than distance E" as if it were "Distance D is infinitesimal as compared to distance 
E." As we shall see, under this interpretation, systems of constraints over distances can be 
solved efficiently. The logical foundations for dealing with infinitesimal quantities lie in the
non-standard model of the real line with infinitesimals, developed by Abraham Robinson 
(1965). (A more readable account is given by Keisler, 1976.) Reasoning with quantities of 
infinitely different scale is known as "order of magnitude" reasoning. 

The reader may ask, "Since infinitesimals have no physical reality, what is the value 
of developing techniques for reasoning about them?" In none of the examples, after all, is 
the smaller quantity truly infinitesimal or the larger one truly infinite. In examples 1 and
2, the ratio between successive sizes is somewhere between 10 and 100; in example 3, it
is between 100 and a rather large number difficult to estimate; but one can always give 
some kind of upper bound. It is essentially certain, for instance, that the ratio between the 
times required for tasks (a) and (i) is less than lo^'"'''""'. Why not use the best real- valued 
estimate instead? 

The first answer is that this is an idealization. Practically all physical reasoning and 
calculation rest on one idealization or another: the idealization in the situation calculus
that time is discrete; the idealization that solid objects are rigid, employed in most
mechanics programs; the idealization that such physical properties as density, temperature, and
pressure are continuous rather than local averages over atoms, which underlies most uses of 
partial differential equations; the idealization involved in the use of the Dirac delta function; 
and so on. Our idealization here that a very short distance is infinitesimally smaller than a 
long one simplifies reasoning and yields useful results as long as care is taken to stay within 
an appropriate range of application. 

The second answer is that this is a technique of mathematical approximation, which we 
are using to turn an intractable problem into a tractable one. This would be analogous to 
linearizing a non-linear equation over a small neighborhood; or to approximating a sum by 
an integral. 

There are circumstances where we can be sure that the approximation gives an answer 
that is guaranteed exactly correct; namely if the actual ratio implicit in the comparison
"D is much smaller than E" is larger than the number of points involved in the system of
constraints. This will be proven in Section 7. There is also a broader, less well-defined, class 
of problems where the approximation, though not guaranteed correct, is more reliable than 
some of the other links in the reasoning. For instance, suppose that one were to consider 
an instance of example 3 involving a couple of hundred tasks, apply order-of-magnitude 
reasoning, and come up with an answer that can be determined to be wrong. It is possible 
that the error would be due to the order-of-magnitude reasoning. However, it seems safe 
to say that, in most cases, the error is more likely to be due to a mistake in estimating the 
comparative sizes. 

3. Order-of-magnitude spaces 

An order-of-magnitude space, or om-space, is a space of geometric points. Any two points
are separated by a distance. Two distances d and e are compared by the relation d ≪ e,
meaning "Distance d is infinitesimal compared to e" or, more loosely, "Distance d is much
smaller than e."

For example, let ℜ* be the non-standard real line with infinitesimals. Let ℜ*^m be the
corresponding m-dimensional space. Then we can let a point of the om-space be a point in
ℜ*^m. The distance between two points a, b is the Euclidean distance, which is a non-negative
value in ℜ*. The relation d ≪ e holds for two distances d, e, if d/e is infinitesimal.

The distance operator and the comparator are related by a number of axioms, specified
below. The most interesting of these is called the om-triangle inequality: If ab and bc are
both much smaller than xy, then ac is much smaller than xy. This combines the ordinary
triangle inequality "The distance ac is less than or equal to distance ab plus distance bc"
together with the rule from order-of-magnitude algebra, "If p ≪ r and q ≪ r then p + q ≪ r."

It will simplify the exposition below if, rather than talking about distances, we talk about
orders of magnitude. These are defined as follows. We say that two distances d and e have
the same order of magnitude if neither d ≪ e nor e ≪ d. In ℜ* this is the condition that d/e
is finite: neither infinitesimal nor infinite. (Raiman, 1990 uses the notation "d Co e" for this
relation.) By the rules of the order-of-magnitude calculus, this is an equivalence relation.
Hence we can define an order of magnitude to be an equivalence class of distances under
the relation "same order of magnitude". For two points a, b, we define the function od(a, b)
to be the order of magnitude of the distance from a to b. For two orders of magnitude p, q,
we define p ≪ q if, for any representatives d ∈ p and e ∈ q, d ≪ e. By the rules of the
order-of-magnitude calculus, if this holds for any representatives, it holds for all representatives.
The advantage of using orders of magnitude and the function "od", rather than distances
and the distance function, is that it allows us to deal with logical equality rather than the
equivalence relation "same order of magnitude".

For example, in the non-standard real line, let δ be a positive infinitesimal value. Then
values such as {1, 100, 2 − 50δ + 100δ^2, ...} are all of the same order of magnitude, o1. The
values {δ, 100δ, 3δ + e^(−1/δ), ...} are of a different order of magnitude o2 ≪ o1. The values
{1/δ, 10/δ + ...} are of a third order of magnitude o3 ≫ o1.

Definition 1: An order-of-magnitude space (om-space) Ω consists of:

• A set of points 𝒫;

• A set of orders of magnitude 𝒟;

• A distinguished value 0 ∈ 𝒟;

• A function "od(a, b)" mapping two points a, b ∈ 𝒫 to an order of magnitude;

• A relation "d ≪ e" over two orders of magnitude d, e ∈ 𝒟

satisfying the following axioms:

A.1 For any orders of magnitude d, e ∈ 𝒟, exactly one of the following holds: d ≪ e,
e ≪ d, d = e.

A.2 For d, e, f ∈ 𝒟, if d ≪ e and e ≪ f then d ≪ f.
(Transitivity. Together with A.1, this means that ≪ is a total ordering on orders of
magnitude.)

A.3 For any d ∈ 𝒟, not d ≪ 0.
(0 is the minimal order of magnitude.)

A.4 For points a, b ∈ 𝒫, od(a, b) = 0 if and only if a = b.
(The function od is positive definite.)

A.5 For points a, b ∈ 𝒫, od(a, b) = od(b, a).
(The function od is symmetric.)

A.6 For points a, b, c ∈ 𝒫, and order of magnitude d ∈ 𝒟,
if od(a, b) ≪ d and od(b, c) ≪ d then od(a, c) ≪ d.
(The om-triangle inequality.)

A.7 There are infinitely many different orders of magnitude.

A.8 For any point a_1 ∈ 𝒫 and order of magnitude d ∈ 𝒟, there exists an infinite set
a_2, a_3, ... such that od(a_i, a_j) = d for all i ≠ j.

The example we have given above of an om-space, non-standard Euclidean space, is wild 
and woolly and hard to conceptualize. Here are two simpler examples of om-spaces: 

I. Let δ be an infinitesimal value. We define a point to be a polynomial in δ with integer
coefficients, such as 3 + 5δ − 8δ^5. We define an order of magnitude to be a power of δ. We
define δ^m ≪ δ^n if m > n; for example, δ^6 ≪ δ^4. We define od(a, b) to be the smallest power
of δ in a − b. For example, od(1 + δ^2 − 3δ^3, 1 − 5δ^2 + 4δ^4) = δ^2.

II. Let N be an infinite value. We define a point to be a polynomial in N with integer
coefficients. We define an order of magnitude to be a power of N. We define N^p ≪ N^q if
p < q; for example, N^4 ≪ N^6. We define od(a, b) to be the largest power of N in a − b. For
example, od(1 + N^2 − 3N^3, 1 − 5N^2 + 4N^4) = N^4.

It can be shown that any om-space either contains a subset isomorphic to (I) or a subset 
isomorphic to (II). (This is just a special case of the general rule that any infinite total 
ordering contains either an infinite descending chain or an infinite ascending chain.) 
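Example (II) is simple enough to model directly. The following is a minimal sketch, not part of the original development: the dictionary representation of polynomials, the names od, much_less and ZERO, and the three sample points are all assumptions made for illustration. The order of magnitude N^p is represented by the integer p, and the minimal order 0 by negative infinity, so that the comparison of orders reduces to the ordinary numeric comparison.

```python
ZERO = float("-inf")  # assumed stand-in for the minimal order of magnitude 0


def od(a, b):
    """Order of magnitude of the distance between two polynomials in N.

    A polynomial is a dict {exponent: integer coefficient}.  The result is
    the largest exponent at which a and b differ (standing for that power
    of N), or ZERO when a == b.
    """
    exponents = set(a) | set(b)
    differing = [e for e in exponents if a.get(e, 0) != b.get(e, 0)]
    return max(differing, default=ZERO)


def much_less(d, e):
    """d << e on orders of magnitude: N^p << N^q iff p < q."""
    return d < e


# Illustrative points (assumed, chosen only to exercise the definitions).
a = {0: 1, 2: 1, 3: -3}   # 1 + N^2 - 3N^3
b = {0: 1, 2: -5, 4: 4}   # 1 - 5N^2 + 4N^4
c = {0: 7, 4: 4}          # 7 + 4N^4

# a - b = 6N^2 - 3N^3 - 4N^4, so the largest power present is N^4.
print(od(a, b))
```

Because od picks out the leading term of a difference, the om-triangle inequality (axiom A.6) holds automatically in this model: the leading term of a − c cannot exceed the larger of the leading terms of a − b and b − c.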

We will use the notation "d ≪= e" as an abbreviation for "d ≪ e or d = e".

4. Cluster Trees 

Let P be a finite set of points in an om-space. If the distances between different pairs of 
points in P are of different orders of magnitude, then the om-space imposes a unique
tree-like hierarchical structure on P. The points will naturally fall into clusters, each cluster C
being a collection of points all of which are much closer to one another than to any point in 
P outside C. The collection of all the clusters over P forms a strict tree under the subset 
relation. Moreover, the structure of this tree and the comparative sizes of different clusters 
in the tree captures all of the order-of-magnitude relations between any pair of points in P. 
The tree of clusters is thus a very powerful data structure for reasoning about points in an 
om-space, and it is, indeed, the central data structure for the algorithms we will develop in 
this paper. In this section, we give a formal definition of cluster trees and prove some basic 
results as foundations for our algorithms. 

Definition 2: Let P be a finite set of points in an om-space. A non-empty subset C ⊆ P is
called a cluster of P if for every x, y ∈ C, z ∈ P − C, od(x, y) ≪ od(x, z). If C is a cluster,
the diameter of C, denoted "odiam(C)", is the maximum value of od(x, y) for x, y ∈ C.

Figure 1: Cluster tree

Note that the set of any single element of P is trivially a cluster of P. The entire set P 
is likewise a cluster of P. The empty set is by definition not a cluster of P. 

Lemma 1: If C and D are clusters of P, then either C ⊆ D, D ⊆ C, or C and D are
disjoint.

Proof: Suppose not. Then let x ∈ C ∩ D, y ∈ C − D, z ∈ D − C. Since C is a cluster,
od(x, y) ≪ od(x, z). Since D is a cluster, od(x, z) ≪ od(x, y). Thus we have a contradiction.

□ 
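Definition 2 and Lemma 1 can be checked by brute force on a small point set. The sketch below is illustrative only and rests on assumptions of our own: points are coefficient tuples (c0, c1, c2) standing for c0 + c1·N + c2·N^2 in the style of example (II), od returns the highest differing index (−1 for equal points), and the function names are hypothetical. The enumeration is exponential and is meant only for tiny examples.

```python
from itertools import combinations


def od(a, b):
    """Highest index at which equal-length tuples a and b differ, -1 if equal."""
    return max((i for i, (x, y) in enumerate(zip(a, b)) if x != y), default=-1)


def is_cluster(C, P):
    """Definition 2: within C, points are much closer than to anything outside C."""
    C = set(C)
    outside = set(P) - C
    return bool(C) and all(
        od(x, y) < od(x, z) for x in C for y in C for z in outside
    )


def clusters(P):
    """All clusters of P, found by testing every non-empty subset."""
    P = list(P)
    return [
        frozenset(C)
        for r in range(1, len(P) + 1)
        for C in combinations(P, r)
        if is_cluster(C, P)
    ]


# Four assumed points: 0, 1, N and N^2.
P = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
found = clusters(P)

# Lemma 1: any two clusters are nested or disjoint.
laminar = all(C <= D or D <= C or not (C & D) for C in found for D in found)
print(len(found), laminar)
```

Here the clusters are the four singletons, {0, 1}, {0, 1, N} and the whole set; the pair {0, N} is rejected because 0 is closer to 1 than to N.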

By virtue of lemma 1, the clusters of a set P form a tree. We now develop a representa- 
tion of the order of magnitude relations in P by constructing a tree whose nodes correspond 
to the clusters of P, labelled with an indication of the relative size of each cluster. 

Definition 3: A cluster tree is a tree T such that 

• Every leaf of T is a distinct symbol. 

• Every internal node of T has at least two children. 

• Each internal node of T is labelled with a non-negative value. Two or more nodes 
may be given the same value. (For the purposes of Sections 5-7, labels may be taken 
to be non-negative integers; in Section 8, it will be useful to allow rational labels.) 

• Every leaf of the tree is labelled 0. 

• The label of every internal node in the tree is less than the label of its parent. 

For any node N of T, the field "N.symbols" gives the set of symbols in the leaves in the
subtree of T rooted at N, and the field "N.label" gives the integer label on node N.
Thus, for example, in Figure 1, n3.label = 3 and n3.symbols = {a, d}; n1.label = 5 and
n1.symbols = {a, b, c, d, e, f, g}.
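The conditions of Definition 3 translate directly into a small data structure. The following sketch is one possible encoding, not the paper's own: the class name Node, the optional symbol field on leaves, and the checker is_cluster_tree are assumptions introduced here. The sample tree is the one built in Table 1; the second tree is a deliberately invalid example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: int = 0
    symbol: Optional[str] = None            # set only on leaves
    children: List["Node"] = field(default_factory=list)

    @property
    def symbols(self):
        """Set of symbols in the leaves of the subtree rooted here."""
        if not self.children:
            return {self.symbol}
        return set().union(*(c.symbols for c in self.children))


def is_cluster_tree(root):
    """Check the conditions of Definition 3 on the tree rooted at root."""
    leaves = []
    ok = True

    def visit(node, parent_label):
        nonlocal ok
        if node.children:
            ok &= len(node.children) >= 2            # at least two children
            ok &= node.label >= 0                    # non-negative label
            if parent_label is not None:
                ok &= node.label < parent_label      # smaller than its parent
            for child in node.children:
                visit(child, node.label)
        else:
            ok &= node.label == 0                    # leaves are labelled 0
            leaves.append(node.symbol)

    visit(root, None)
    return ok and len(leaves) == len(set(leaves))    # leaves are distinct


def leaf(s):
    return Node(symbol=s)


tree = Node(5, children=[
    Node(4, children=[leaf("w"), leaf("x"), leaf("y")]),
    Node(0, children=[leaf("v"), leaf("z")]),
])
# Invalid: the inner node's label is not smaller than its parent's.
bad = Node(3, children=[Node(3, children=[leaf("a"), leaf("b")]), leaf("c")])
print(is_cluster_tree(tree), is_cluster_tree(bad))
```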

As we shall see, the nodes of the tree T represent the clusters of a set of points, and the 
labels represent the relative sizes of the diameters of the clusters. 

Definition 4: A valuation over a set of symbols is a function mapping each symbol to a 
point in an om-space. If T is a cluster tree, a valuation over T is a valuation over T. symbols. 
If N is any node in T and F is a valuation over T, we will write F(N) as an abbreviation
for F(N.symbols).

We now define how a cluster tree T expresses the order of magnitude relations over a 
set of points P. 

Definition 5: Let T be a cluster tree and let F be a valuation over T. Let P = F(T), the
set of points in the image of T under F. We say that F ⊨ T (read F satisfies or instantiates
T) if the following conditions hold:

i. For any internal node N of T, F(N) is a cluster of P.

ii. For any cluster C of P, there is a node N such that C = F(N).

iii. For any nodes M and N, if M.label < N.label then odiam(F(M)) ≪ odiam(F(N)).

iv. If M.label = 0, then odiam(F(M)) = 0. (That is, all children of M are assigned the
same value under F.)

The following algorithm generates an instantiation F given a cluster tree T: 

procedure instantiate(in T : cluster tree; Ω : an om-space)
  return : array of points indexed on the symbols of T

  variable G[N] : array of points indexed on the nodes of T;

  Let k be the number of internal nodes in T;
  Choose δ_0 = 0 ≪ δ_1 ≪ δ_2 ≪ ... ≪ δ_k to be k + 1 different orders of magnitude;
  /* Such values can be chosen by virtue of axiom A.7 */
  pick a point x ∈ Ω;
  G[root of T] := x;
  instantiate1(T, Ω, δ_1 ... δ_k, G);
  return the restriction of G to the symbols of T.
end instantiate.

instantiate1(in N : a node in a cluster tree; Ω : an om-space; δ_1 ... δ_k : orders of magnitude;
             in out G : array of points indexed on the nodes of T)
  if N is not a leaf then
    let C_1 ... C_p be the children of N;
    x_1 := G[N];
    q := N.label;
    pick points x_2 ... x_p such that
      for all i, j ∈ 1 ... p, if i ≠ j then od(x_i, x_j) = δ_q;
    /* Such points can be chosen by virtue of axiom A.8 */
    for i = 1 ... p do
      G[C_i] := x_i;
      instantiate1(C_i, Ω, δ_1 ... δ_k, G);
    endfor
  endif end instantiate1.

Thus, we begin by picking orders of magnitude corresponding to the values of the labels. 
We pick an arbitrary point for the root of the tree, and then recurse down the nodes of the 
tree. For each node N , we place the children at points that all lie separated by the desired 
diameter of N . The final placement of the leaves is then the desired instantiation. 
Lemma 2: If T is a cluster tree and Ω is an om-space, then instantiate(T, Ω) returns an
instantiation of T.

The proof is given in the appendix. 
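To see the procedure at work, one can specialize it to the om-space of polynomials in an infinite value N (example II of Section 3). The sketch below is an illustration under several assumptions of our own rather than a transcription of instantiate: a tree is written as a pair (label, list of subtrees) with a bare string for a leaf; a point is a dict from exponents to integer coefficients; the order of magnitude chosen for label q > 0 is simply N^q; and the i-th child of a node is offset from its parent's point by i·N^q. Children of a node labelled 0 share their parent's point.

```python
def instantiate(tree, base=None):
    """Return {symbol: point} for a cluster tree given as nested pairs."""
    base = dict(base or {})
    if isinstance(tree, str):               # a leaf: place the symbol here
        return {tree: base}
    label, children = tree
    placement = {}
    for i, child in enumerate(children):
        point = dict(base)
        if label > 0 and i > 0:
            # Labels strictly decrease down the tree, so exponent `label`
            # has not been used by any ancestor offset.
            point[label] = i
        placement.update(instantiate(child, point))
    return placement


def od(a, b):
    """Largest exponent at which a and b differ; 0 stands for the order 0."""
    exponents = set(a) | set(b)
    return max((e for e in exponents if a.get(e, 0) != b.get(e, 0)), default=0)


# The tree constructed in Table 1.
tree = (5, [(4, ["w", "x", "y"]), (0, ["v", "z"])])
G = instantiate(tree)
print({s: sorted(p.items()) for s, p in G.items()})
```

In the resulting placement, two symbols are separated by the order of magnitude attached to the lowest node that contains them both, which is the property the proof of Lemma 2 establishes in general.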

Moreover, it is clear that any instantiation F of T can be generated as a possible output
of instantiate(T, Ω). (Given an instantiation F, just pick G[N] at each stage to be F of some
symbol of N.)

Note that, given any valuation F over a finite set of symbols S, there exists a cluster 
tree T such that T. symbols = S and F satisfies T. Such a T is essentially unique up to an 
isomorphism over the set of labels that preserves the label and the order of labels. 

5. Constraints 

In this section, we develop the first of our algorithms. Algorithm solve_constraints tests 
a collection of constraints of the form "a is much closer to b than c is to d," for consis- 
tency. If the set is consistent, then the algorithm returns a cluster tree that satisfies the 
constraints. The algorithm builds the cluster tree from top to bottom dealing first with the 
large distances, and then proceeding to smaller and smaller distances. 

Let S be a system of constraints of the form od(a, b) ≪ od(c, d), and let T be a cluster
tree. We will say that T ⊢ S (read "T satisfies S") if every instantiation of T satisfies S. In
this section, we develop an algorithm for finding a cluster tree that satisfies a given set of 
constraints. 
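Whether a given cluster tree satisfies a system of constraints can be tested without constructing any instantiation. The sketch below assumes, in line with the remark in Section 4 that labels represent the relative sizes of cluster diameters, that od(a, b) in any instantiation has the order of magnitude attached to the lowest node containing both a and b; a constraint then holds exactly when the label of that node for the short edge is smaller than the label for the long edge. The nested-pair encoding of trees and both function names are hypothetical.

```python
def lca_label(tree, a, b):
    """Label of the lowest node whose leaves include both a and b.

    A tree is (label, [subtrees]) or a bare string for a leaf; a and b are
    assumed to occur in the tree.
    """
    def symbols(t):
        if isinstance(t, str):
            return {t}
        return set().union(*map(symbols, t[1]))

    if isinstance(tree, str):
        return 0                      # a == b: a single leaf, labelled 0
    label, children = tree
    for child in children:
        if {a, b} <= symbols(child):
            return lca_label(child, a, b)
    return label


def tree_satisfies(tree, constraints):
    """constraints: list of ((a, b), (c, d)) meaning od(a,b) << od(c,d)."""
    return all(
        lca_label(tree, a, b) < lca_label(tree, c, d)
        for (a, b), (c, d) in constraints
    )


tree = (5, [(4, ["w", "x", "y"]), (0, ["v", "z"])])
S = [(("w", "x"), ("x", "v")),
     (("x", "y"), ("y", "z")),
     (("v", "z"), ("w", "y"))]
print(tree_satisfies(tree, S))
```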

The algorithm works along the following lines: Suppose we have a solution satisfying S. 
Let D be the diameter of the solution. If S contains a constraint od(a, b) ≪ od(c, d) then,
since od(c, d) is certainly no more than D, it follows that od(a, b) is much smaller than D.
We label ab as a "short" edge. 

If two points u and v are connected by a path of short edges, then by the triangle 
inequality the edge uv is also short (i.e. much shorter than D). Thus, if we compute the 
connected components H of all the edges that have been labelled short, then all these edges 
in H can likewise be labelled short. For example, in Table 1, edges vz, wx, and xy can all
be labelled "short". 

On the other hand, as we shall prove below, if an edge is not in the set H, then there is 
no reason to believe that it is much shorter than D. We can, in fact, safely posit that it is 
the same o.m. as D. We label all such edges "long". 

We can now assume that any connected component of points connected by short edges 
is a cluster, and a child of the root of the cluster tree. The root of the cluster tree is then 
given the largest label. Its children will be given smaller labels. Each "long" edge now 
connects symbols in two different children of the root. Hence, any instantiation of the tree
will make any long edge longer than any short edge. 

If no edges are labelled "long" — that is, if H contains the complete graph over the 
symbols — then there is an inconsistency; all edges are much shorter than the longest edge. 
For instance, in Table 2, since vw, wx, and xy are all much smaller than zy, it follows
by the triangle inequality that vy is much smaller than zy. But since we also have the 
constraints that zy is much smaller than vz and that vz is much smaller than vy, we have 
an inconsistency. 

The algorithm then iterates, at the next smaller scale. Since we have now taken care of 
all the constraints od(a, b) ≪ od(c, d), where cd was labelled "long", we can drop all those
from S. Let D now be the greatest length of all the edges that remain in S. If a constraint
od(a, b) ≪ od(c, d) is in the new S, then we know that od(a, b) is much shorter than D, and
we label it "short" . We continue as above. The algorithm halts when all the constraints in 
S have been satisfied, and S is therefore empty; or when we encounter a contradiction, as 
above. 

We now give the formal statement of this algorithm. The algorithm uses an undirected 
graph over the variable symbols in S. Given such a graph G, and a constraint C of the 
form od(a, b) ≪ od(c, d), we will refer to the edge ab as the "short" of C, and to the edge
cd as the "long" of C. The shorts of the system S is the set of all shorts of the constraints 
of S and the longs of S is the set of all the longs of the constraints. An edge may be both a 
short and a long of S if it appears on one side in one constraint and on the other in another 
constraint. 

procedure solve_constraints(in S: a system of constraints of the form od(a, b) ≪ od(c, d))
  return either a cluster tree T satisfying S if S is consistent;
         or false if S is inconsistent.

type: A node N of the cluster tree contains
        pointers to the parent and children of N;
        the field N.label, holding the integer label;
        and the field N.symbols, holding the list of symbols in the leaves of N.

variables: m is an integer;
           C is a constraint in S;
           H, I are undirected graphs;
           N, M are nodes of T;

begin if S contains any constraint of the form "od(a, b) ≪ od(c, c)" then return false;
  m := the number of variables in S;
  initialize T to consist of a single node N;
  N.symbols := the variables in S;
  repeat H := the connected components of the shorts of S;
    if H contains all the edges in S then return(false) endif;
    for each leaf N of T do
      if not all vertices of N are connected in H then
        N.label := m;
        for each connected component I of N.symbols in H do
          construct node M as a new child of N in T;
          M.symbols := the vertices of I;
        endfor endif endfor
    S := the subset of constraints in S whose long is in H;
    m := m − 1;
  until S is empty;

  for each leaf N of T
    N.label := 0;
    if N.symbols has more than one symbol
      then create a leaf of N for each symbol in N.symbols;
           label each such leaf 0;
    endif endfor end solve_constraints.
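The procedure above can be rendered in executable form roughly as follows. This is a sketch under assumptions of our own, not an implementation supplied with the paper: a constraint od(a, b) ≪ od(c, d) is written as the pair ((a, b), (c, d)); the graph H is represented implicitly by a union-find table, so that "edge uv is in H" means u and v lie in the same connected component of the shorts; and None plays the role of false. It is the straightforward version, without the improvements of Section 6.1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbols: frozenset
    label: int = 0
    children: list = field(default_factory=list)


def components(vertices, edges):
    """Map each vertex to a representative of its connected component."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in vertices}


def solve_constraints(constraints):
    """Return the root of a cluster tree, or None if S is inconsistent."""
    if any(c == d for _, (c, d) in constraints):
        return None
    variables = {v for short, long in constraints for v in short + long}
    m = len(variables)
    root = Node(frozenset(variables))
    leaves, S = [root], list(constraints)
    while S:
        comp = components(variables, [short for short, _ in S])
        in_H = lambda edge: comp[edge[0]] == comp[edge[1]]
        if all(in_H(long) for _, long in S):     # H contains every edge of S
            return None
        next_leaves = []
        for leaf in leaves:
            groups = {}
            for v in leaf.symbols:
                groups.setdefault(comp[v], set()).add(v)
            if len(groups) > 1:                  # not all connected in H
                leaf.label = m
                leaf.children = [Node(frozenset(g)) for g in groups.values()]
                next_leaves.extend(leaf.children)
            else:
                next_leaves.append(leaf)
        leaves = next_leaves
        S = [c for c in S if in_H(c[1])]         # keep constraints whose long is in H
        m -= 1
    for leaf in leaves:                          # cleanup pass
        leaf.label = 0
        if len(leaf.symbols) > 1:
            leaf.children = [Node(frozenset([s])) for s in leaf.symbols]
    return root


table1 = [(("w", "x"), ("x", "v")),
          (("x", "y"), ("y", "z")),
          (("v", "z"), ("w", "y"))]
tree = solve_constraints(table1)
print(tree.label, sorted(sorted(c.symbols) for c in tree.children))
```

Run on the three constraints of the first worked example, the sketch produces a root labelled 5 with children {w, x, y} (label 4) and {v, z} (label 0).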

Tables 1 and 2 give two examples of the working of procedure solve_constraints. Table
1 shows how the procedure can be used to establish that the following constraints are
consistent:

The Empire State Building (x) is much closer to the Washington Monument (w)
than to Notre Dame Cathedral (v).

Bunker Hill (y) is much closer to the Empire State Building than to the Eiffel
Tower (z).

The distance from the Eiffel Tower to Notre Dame is much less than the distance 
from the Washington Monument to Bunker Hill. 

Table 2 shows that the following inference can be justified:

Given: The distances from the Statue of Liberty (v) to the World Trade Center
(w), from the World Trade Center to the Empire State Building (x), and from
the Empire State Building to the Chrysler Building (y) are all much less than
the distance from the Chrysler Building to the Washington Monument (z).

Infer: The Washington Monument is not much nearer to the Chrysler Building 
than to the Statue of Liberty. 

This inference is carried out by asserting the negation of the consequent, "The Washing- 
ton Monument is much nearer to the Chrysler Building than to the Statue of Liberty," and 
showing that that collection of constraints is inconsistent. Note that if we change "much
less" and "much nearer" in this example to "less" and "nearer", then the inference is no longer
valid.

Theorem 1 states the correctness of algorithm solve_constraints. The proof is given in 
the appendix. 

Theorem 1: The algorithm solve_constraints(S) returns a cluster tree satisfying S if S is
consistent, and returns false if S is inconsistent.

There may be many cluster trees that satisfy a given set of constraints. Among these, 
the cluster tree returned by the algorithm solve_constraints has an important property: it 
has the fewest possible labels consistent with the constraints. In other words, it uses the 
minimum number of different orders of magnitude of any solution. Therefore, the algorithm 
can be used to check the satisfiability of a set of constraints in an om-space that violates 
S contains the constraints

1. od(w,x) ≪ od(x,v).
2. od(x,y) ≪ od(y,z).
3. od(v,z) ≪ od(w,y).

The algorithm proceeds as follows:

Initialization:
  The tree is initialized to a single node n1.
  n1.symbols := { v,w,x,y,z }.

First iteration:
  The shorts of S are { wx,xy,vz }.
  Computing the connected components, H is set to { wx,xy,wy,vz }.
  n1.label := 5;
  Two children of n1 are created:
    n11.symbols := w,x,y;
    n12.symbols := v,z;
  As xv is not in H, delete constraint #1 from S.
  As yz is not in H, delete constraint #2 from S.
  S now contains just constraint #3.

Second iteration:
  The shorts of S are { vz }.
  The connected components H is just { vz }.
  n11.label := 4;
  Three children of n11 are created:
    n111.symbols := w;
    n112.symbols := x;
    n113.symbols := y;
  As wy is not in H, delete constraint #3 from S.
  S is now empty.

Cleanup:
  n12.label := 0;
  Two children of n12 are created:
    n121.symbols := v;
    n122.symbols := z;

(See Figure 2.)

Table 1: Example of computing a cluster tree
Figure 2: Building a cluster tree
S contains the constraints

od(v,w) ≪ od(z,y).
od(w,x) ≪ od(z,y).
od(x,y) ≪ od(z,y).
od(z,y) ≪ od(v,z).

The algorithm proceeds as follows:

Initialization:
  The tree is initialized to a single node n1.
  n1.symbols := { v,w,x,y,z }.

First iteration:
  The shorts of S are { vw, wx, xy, zy, vz }.
  H is set to its connected components, which is the complete graph over v,w,x,y,z.
  The algorithm exits returning false.

Table 2: Example of determining inconsistency

axiom A. 7 and has only finitely many different orders of magnitude. If the algorithm returns 
T and T has no more different labels than the number of different orders of magnitude in 
the space, then the constraints are satisfiable. If T uses more labels than the space has 
orders of magnitude, then the constraints are unsatisfiable. 

The proof is easier to present if we rewrite algorithm solve_constraints in the following 
form, which returns only the number of different non-zero labels used, but does not actually 
construct the cluster tree.¹ 

function num_labels(S); 
    if S is empty then return(0) 
    else return(1 + num_labels(reduce_constraints(S))) 

function reduce_constraints(S) 
    H := connected components of the shorts of S; 
    if H contains all the edges in S then return(false) to top-level 
    else return(the set of constraints in S whose long is in H) 

It is easily verified that the sequence of values of S in successive recursive calls to 
num_labels is the same as the sequence of values of S in the main loop of solve_constraints. 
Therefore num_labels returns the number of different non-zero labels in the tree constructed 
by solve_constraints. 
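As an illustration, num_labels can be sketched in Python. This is a minimal sketch under stated assumptions, not the paper's implementation: a constraint od(a,b) ≪ od(c,d) is encoded as the pair ((a, b), (c, d)), the recursion is unrolled into a loop, inconsistency is reported as None rather than by a non-local return of false, and the helper connected_components is a name introduced here.

```python
def connected_components(vertices, edges):
    """Map each vertex to a representative of its connected component."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in vertices}


def reduce_constraints(constraints):
    """One pass of the main loop; returns None if the system is inconsistent."""
    vertices = {p for short, long_ in constraints for p in (*short, *long_)}
    comp = connected_components(vertices, [short for short, _ in constraints])

    def in_h(edge):
        return comp[edge[0]] == comp[edge[1]]

    # Every short is in H by construction, so only the longs need checking.
    if all(in_h(long_) for _, long_ in constraints):
        return None
    return [c for c in constraints if in_h(c[1])]


def num_labels(constraints):
    """Number of distinct non-zero labels, or None if inconsistent."""
    count = 0
    while constraints:
        constraints = reduce_constraints(constraints)
        if constraints is None:
            return None
        count += 1
    return count
```

On the constraints of Table 1 this sketch performs two reductions, matching the two non-zero labels (5 and 4) of the tree built there; on the constraints of Table 2 it reports inconsistency in the first pass.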



1. The reader may wonder why this simpler algorithm was not presented before the more complicated 
algorithm solve_constraints. The reason is that the only proof we have found that the system of 
constraints is consistent if num_labels does not return false relies on the relation between 
num_labels and the constructive solve_constraints. 

Theorem 2: Out of all solutions to the set of constraints S, the instantiations of 
solve_constraints(S) have the fewest number of different values of od(a, b), where a, b range 
over the symbols in S. This number is given by num_labels(S). 
The proof is given in the appendix. 

6. Extensions and Consequences 

We next present a number of modifications of the algorithm solve_constraints. The first 
is a more efficient implementation. The second extends the algorithm to handle non-strict 
comparisons. The third extends the algorithm to handle a combination of order-of-magnitude 
comparisons on distance with order comparisons, in a one-dimensional space. 

6.1 An Efficient Implementation of Solve_constraints 

It is possible to implement algorithm solve_constraints somewhat more efficiently than the 
naive encoding of the above description. The key is to observe that the graph H of connected 
components does not have to be computed explicitly: it suffices to compute it implicitly using 
merge-find sets (union-find sets). Combining this with suitable back pointers from edges to 
constraints, we can formulate a more efficient version of the algorithm. 
We use the following data structures and subroutines: 

• Each node N of the cluster tree contains pointers to its parent and children; a field 
N.label, holding the integer label; a field N.symbols, holding the list of symbols in 
the leaves of N; and a field N.mfsets, holding a list of the connected components of 
the symbols in N. As described below, each connected component is implemented as 
a merge-find set (MFSET). 

• An edge E in the graph over symbols contains its two endpoints, each of which is a 
symbol; a field E.shorts, a list of the constraints in which E appears as a short; and 
a field E.longs, a list of the constraints in which E appears as a long. 

• A constraint C has two fields, C.short and C.long, both of them edges. It also has 
pointers into the lists C.short.shorts and C.long.longs, enabling C to be removed in 
constant time from the constraint lists associated with the individual edges. 

• We will use the disjoint-set forest implementation of MFSETs (Cormen, Leiserson, 
and Rivest, 1990, p. 448) with merging smaller sets into larger and path-compression. 
Thus, each MFSET is an upward-pointing tree of symbols, each node of the tree being 
a symbol. The tree as a whole is represented by the symbol at the root. A symbol A 
then has the following fields: 

- A.parent is a pointer to the parent in the MFSET tree. 

- A.cluster_leaf is a pointer to the leaf in the cluster tree containing A. 

- If A is the root of the MFSET, then A.size holds the size of the MFSET. 

- If A is the root of the MFSET, then A.symbols holds all the elements of the 
MFSET. 

- If A is the root of the MFSET, then A.leaf_ptr holds a pointer to the pointer to 
A in N.mfsets, where N = A.cluster_leaf. 

We can now describe the algorithm. 

procedure solve_constraints1(in S: a system of constraints of the form od(a, b) ≪ od(c, d)) 
    return either a cluster tree T satisfying S if S is consistent; 
           or false if S is inconsistent. 

variables: m is an integer; 
           a, b are symbols; 
           C is a constraint in S; 
           H is an undirected graph; 
           E, F are edges; 
           P is an MFSET; 
           N, M are nodes of T; 

0.  begin if S contains any constraint of the form "od(a, b) ≪ od(c, c)" then return false; 
1.  H := ∅; 
2.  for each constraint C in S with short E and long F do 
3.      add E and F to H; 
4.      add C to E.shorts and to F.longs endfor; 
5.  m := the number of variables in S; 
6.  initialize T to contain the root N; 
7.  N.symbols := the variables in S; 
8.  repeat for each leaf N of T, INITIALIZE_MFSETS(N); 
9.      for each edge E = ab in H do 
10.         if E.shorts is non-empty and FIND(a) ≠ FIND(b) then 
11.             MERGE(FIND(a), FIND(b)) endif endfor 
12.     if every edge E = ab in H satisfies FIND(a) = FIND(b) 
13.         then return(false) endif 
14.     for each current leaf N of T do 
15.         if N.mfsets has more than one element then 
16.             for each mfset P in N.mfsets do 
17.                 construct node M as a new child of N in T; 
18.                 M.symbols := P.symbols; 
19.             endfor endif endfor 
20.     for each edge E = ab in H do 
21.         if FIND(a) ≠ FIND(b) then 
22.             for each constraint C in E.longs do 
23.                 delete C from S; 
24.                 delete C from E.longs; 
25.                 delete C from C.short.shorts endfor 
26.             delete E from H endif endfor 
27.     m := m - 1; 
28. until S is empty; 
29. for each leaf N of T 
30.     N.label := 0; 
31.     if N.symbols has more than one symbol 
32.         then create a leaf of N with label 0 for each symbol in N.symbols; 
33.     endif endfor end solve_constraints1. 



procedure INITIALIZE_MFSETS(N : node) 
    var A : symbol; 
    N.mfsets := ∅; 
    for A in N.symbols do 
        A.parent := null; 
        A.cluster_leaf := N; 
        A.symbols := {A}; 
        A.size := 1; 
        N.mfsets := cons(A, N.mfsets); 
        A.leaf_ptr := N.mfsets; 
    endfor end INITIALIZE_MFSETS. 

procedure MERGE(in A, B : symbol) 
    if A.size > B.size then swap(A, B); 
    A.parent := B; 
    B.size := B.size + A.size; 
    B.symbols := B.symbols ∪ A.symbols; 
    Using A.leaf_ptr, delete A from A.cluster_leaf.mfsets; 
end MERGE. 

procedure FIND(in A : symbol) return symbol; 
    var R : symbol; 
    if A.parent = null then return A 
    else R := FIND(A.parent); 
         A.parent := R; /* Path compression */ 
         return(R) 
end FIND. 
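The MFSET operations above can be sketched in Python as follows. This is a minimal sketch, not the paper's data structure: the class name MFSet and its dictionary-based fields are assumptions introduced here, the cluster_leaf and leaf_ptr back pointers are omitted, and find is written iteratively rather than recursively.

```python
class MFSet:
    """Disjoint-set forest over hashable symbols (union by size, path compression)."""

    def __init__(self, symbols):
        self.parent = {s: None for s in symbols}   # None marks a root
        self.size = {s: 1 for s in symbols}        # meaningful only at roots
        self.members = {s: {s} for s in symbols}   # kept only at roots

    def find(self, a):
        root = a
        while self.parent[root] is not None:
            root = self.parent[root]
        while a != root:                           # path compression
            nxt = self.parent[a]
            self.parent[a] = root
            a = nxt
        return root

    def merge(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.size[a] > self.size[b]:
            a, b = b, a                            # make a the smaller tree
        self.parent[a] = b
        self.size[b] += self.size[a]
        self.members[b] |= self.members.pop(a)     # B.symbols := B.symbols ∪ A.symbols
        return b
```

Merging the shorts wx, xy and vz of Table 1 with this sketch leaves two sets, {w, x, y} and {v, z}, which are the two children created in the first iteration.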



Let n be the number of symbols in S; let e be the number of edges; and let s be the 
number of constraints. Note that n/2 ≤ e ≤ n(n - 1)/2 and that e/2 ≤ s ≤ e(e - 1)/2. 
The running time of solve_constraints1 can be computed as follows. As each iteration of the 
main loop 8-28 splits at least one of the connected components of H, there can be at most 
n - 1 iterations. The MERGE-FIND operations in the for loop 9-11 take together time 
at most O(max(n α(n), e)), where α(n) is the inverse Ackermann function. Each iteration 
of the inner for loop, lines 16-18, creates one node M of the tree. Therefore, there are 
only O(n) iterations of this loop over the entire algorithm. Lines 14, 15 of the outer for 
loop require at most n iterations in each iteration of the main loop. The for loop 22-26 
is executed exactly once in the course of the entire execution of the algorithm for each 
constraint C, and hence takes at most time O(s) over the entire algorithm. Steps 20-21 
require time O(e) in each iteration of the main loop. It is easily verified that the remaining 
operations in the algorithm take no more time than these. Hence the overall running time 
is O(max(n^2 α(n), ne, s)). 

6.2 Adding Non-strict Comparisons 

The algorithm solve_constraints can be modified to deal with non-strict comparisons of the 
form od(a, b) ≤ od(c, d) by, intuitively, marking the edge ab as "short" on each iteration if 
the edge cd has been found to be short. 

Specifically, in algorithm solve_constraints, we make the following two changes. First, 
the revised algorithm takes two parameters: S, the set of strict constraints, and W, the set 
of non-strict constraints. Second, we replace the line 

    H := the connected components of the shorts of S 

with the following code: 

1. H := the shorts of S; 
2. repeat H := the connected components of H; 
3.     for each weak constraint od(a, b) ≤ od(c, d) in W 
4.         if cd is in H then add ab to H endif endfor 
5. until no change has been made to H in the last iteration. 

The proof that the revised algorithm is correct is only a slight extension of the proof of 
theorem 1 and is given in the appendix. 

Optimizing this algorithm for efficiency is a little involved, not only because of the new 
operations that must be included, but also because there are now four parameters (n, the 
number of symbols; e, the number of edges mentioned; s, the number of strict comparisons; 
and w, the number of non-strict comparisons) and the optimal implementation varies 
depending on their relative sizes. In particular, either s or w, though not both, may be 
much smaller than n, and each of these cases requires special treatment for optimal efficiency. 
The best implementation we have found for the case where both s and w are Ω(n) has a 
running time of 0(max(n^, nu;, s)). The details of the implementation are straightforward 
and not of sufficient interest to be worth elaborating here. 

An immediate consequence of this result is that a couple of problems of inference are 
easily computed: 

• To determine whether a constraint C is the consequence of a set of constraints S, 
form the set S ∪ {¬C} and check for consistency. If S ∪ {¬C} is inconsistent then S ⊨ C. 
Note that the negation of the constraint od(a, b) ≪ od(c, d) is the constraint 
od(c, d) ≤ od(a, b). 

• To determine whether two sets of constraints are logically equivalent, check that each 
constraint in the first is a consequence of the second, and vice versa. 
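The revised computation of H and the entailment test can be sketched together in Python. This is a naive sketch under stated assumptions, not the optimized implementation mentioned above: a strict constraint od(a,b) ≪ od(c,d) and a weak constraint od(a,b) ≤ od(c,d) are both encoded as the pair ((a, b), (c, d)), only consistency is decided (no cluster tree is built), and the function names are introduced here for illustration.

```python
def components(vertices, edges):
    """Map each vertex to a representative of its connected component."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in vertices}


def short_closure(vertices, strict, weak):
    """Components of H once the weak constraints have been propagated."""
    edges = [short for short, _ in strict]
    while True:
        comp = components(vertices, edges)
        # For od(a,b) ≤ od(c,d): if cd is in H, then ab must be added to H.
        added = [ab for ab, cd in weak
                 if comp[cd[0]] == comp[cd[1]] and comp[ab[0]] != comp[ab[1]]]
        if not added:
            return comp
        edges += added


def consistent(strict, weak):
    """True iff the strict and weak constraints are jointly satisfiable."""
    strict, weak = list(strict), list(weak)
    vertices = {p for c in strict + weak for edge in c for p in edge}
    while strict:
        comp = short_closure(vertices, strict, weak)
        kept = [c for c in strict if comp[c[1][0]] == comp[c[1][1]]]
        if len(kept) == len(strict):      # H contains all the edges in S
            return False
        strict = kept
    return True


def entails(strict, weak, c):
    """Does S ∪ W entail the strict constraint c = ((a, b), (c, d))?"""
    ab, cd = c
    # The negation of od(a,b) ≪ od(c,d) is the weak constraint od(c,d) ≤ od(a,b).
    return not consistent(strict, weak + [(cd, ab)])
```

For example, from od(a,b) ≪ od(b,c) and od(b,c) ≪ od(c,d) the sketch derives od(a,b) ≪ od(c,d), while it does not derive od(c,d) ≪ od(a,b).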

6.3 Adding Order Constraints 

Example 3 of Section 2 involves a combination of order-of-magnitude constraints on 
distances together with simple ordering on points, where the points lie on a one-dimensional 
line. We next show how to extend algorithm solve_constraints to deal with this more 
complex situation. 

In terms of the axiomatics, adding an ordering on points involves positing that the 
relation p < q is a total ordering and that the ordering of points is related to order of 
magnitude comparisons of distances through the following axiom. 

A.9 For points a, b, c ∈ P, if a < b < c then od(a, b) ≤ od(a, c). 

The following rule is easily deduced: If C and D are disjoint clusters, then either every 
point in C is less than all the points in D, or vice versa. 

In extending our algorithm, we begin by defining an ordered cluster tree to be a cluster 
tree where, for every internal node N, there is a partial order on the children of N. If A 
and B are children of N and A is ordered before B, then in an instantiation of the tree, 
every leaf of A must precede every leaf of B. Procedure instantiate1 can then be modified 
to deal with ordered cluster trees as follows: 

instantiate1(in N : a node in a cluster tree; Ω : an om-space; δ1 ... δk : orders of magnitude; 
             in out G : array of points indexed on the nodes of T) 
    if N is not a leaf then 
        let C1 ... Cp be the children of N in topologically sorted order; 
        x0 := G[N]; 
        q := N.label; 
        pick points x1 ... xp in increasing order such that 
            for all i, j ∈ 0 ... p, if i ≠ j then od(xi, xj) = δq; 
        /* Such points can be chosen by virtue of axiom A.8 */ 
        for i = 1 ... p do 
            G[Ci] := xi; 
            instantiate1(Ci, Ω, δ1 ... δk, G) 
        endfor 
    endif end instantiate1. 

Algorithm solve_constraints is modified as follows: 

procedure solve_constraints2(in S: a system of constraints of the form od(a, b) ≪ od(c, d); 
{NEW}                           O : a system of constraints of the form a < b) 
    return either an ordered cluster tree T satisfying S 
           if S is consistent; 
           or false if S is inconsistent. 

variables: m is an integer; 
           C is a constraint in S; 
           H, I are undirected graphs; 
           M, N, P are nodes of T; 
           a, b, c, d are symbols; 

begin if S contains any constraint of the form "od(a, b) ≪ od(c, c)" 
          then return false; 
{NEW} if O is internally inconsistent (contains a cycle) then return false; 
      m := the number of variables in S; 
      initialize T to consist of a single node N; 
      N.symbols := the variables in S; 
      repeat H := the connected components of the shorts of S; 
{NEW}     H := incorporate_order(H, O); 
          if H contains all the edges in S then return false 
          for each leaf N of T do 
              if not all vertices of N are connected in H then 
                  N.label := m; 
                  for each connected component I of N.symbols in H do 
                      construct node M as a new child of N in T; 
                      M.symbols := the vertices of I; 
                  endfor endif 
{NEW}         for each constraint a < b ∈ O 
{NEW}             if a is in M.symbols and b is in P.symbols 
{NEW}                where M and P are different children of N 
{NEW}             then add an ordering arc from M to P; 
{NEW}         endif endfor 
          endfor 
          S := the subset of constraints in S whose long is in H; 
          m := m - 1; 
      until S is empty; 

      for each leaf N of T 
          N.label := 0; 
          if N.symbols has more than one symbol 
              then create a leaf of N for each symbol in N.symbols; 
                   label each such leaf 0; 
          endif endfor 
end solve_constraints2. 



{NEW} 
function incorporate_order(in H : undirected graph; 
                              O : a system of constraints of the form a < b) 
    return undirected graph; 

variables: G : directed graph; 
           a, b : vertices in H; 
           A, B : connected components of H; 
           V[A] : array of vertices of G indexed on connected components of H; 
           I : subset of vertices of G; 

for each connected component A of H create a vertex V[A] in G; 
for each constraint a < b ∈ O 
    let A and B be the connected components of H containing a and b respectively; 
    if A ≠ B then add an arc in G from V[A] to V[B] endif endfor; 
for each strongly connected component I of G do 
    for each pair of distinct vertices V[A], V[B] ∈ I do 
        for each a ∈ A and b ∈ B add the edge ab to H endfor endfor 
endfor 
end incorporate_order. 

Function incorporate_order serves the following purpose. Suppose that we are in the 
midst of the main loop of solve_constraints2, we have a partially constructed cluster tree, 
and we are currently working on finding the sub-clusters of a node N. As in the original 
form of solve_constraints, we find the connected components of the shorts of the 
order-of-magnitude constraints. Let these be C1 ... Cq; then we know that the diameter of each Ci 
is much smaller than the diameter of N. Now, suppose, for example, that we have in O the 
constraints a1 < a5, b5 < b2, a2 < c1, where a1, c1 ∈ C1; a2, b2 ∈ C2; and a5, b5 ∈ C5. Then 
it follows from axiom A.9 that C1, C2, and C5 must all be merged into a single cluster, 
whose diameter will be less than the diameter of N. Procedure incorporate_order finds all 
such loops by constructing a graph G whose vertices are the connected components of H 
and whose arcs are the ordering relations in O and then computing the strongly connected 
components of G. (Recall that two vertices u, v in a directed graph are in the same strongly 
connected component if there is a cycle from u to v to u.) It then merges together all of 
the connected components of H that lie in a single strongly connected component of G. 
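The merging step can be sketched in Python as follows. This is a simplified sketch under stated assumptions: H is represented only by a map from each symbol to the identifier of its connected component (rather than as an explicit graph to which edges are added), component identifiers are assumed to be mutually comparable, and strongly connected components are found by plain mutual reachability rather than by a linear-time algorithm.

```python
def incorporate_order(comp, order):
    """Merge components of H that lie on a common cycle of ordering constraints.

    comp  : dict mapping each symbol to the id of its connected component in H
    order : list of pairs (a, b), each meaning a < b
    Returns a new symbol -> component id map.
    """
    ids = set(comp.values())
    arcs = {c: set() for c in ids}
    for a, b in order:
        if comp[a] != comp[b]:
            arcs[comp[a]].add(comp[b])

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in arcs[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {c: reachable(c) for c in ids}
    # Two components share a strongly connected component of G exactly when
    # each is reachable from the other; the smallest id serves as representative.
    merged = {c: min(d for d in ids if d in reach[c] and c in reach[d])
              for c in ids}
    return {s: merged[c] for s, c in comp.items()}
```

On the example above, the arcs C1 → C5, C5 → C2 and C2 → C1 form a cycle, so the three components receive the same identifier while any other component is left unchanged.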

The proof of the correctness of algorithm solve_constraints2 is again analogous in 
structure to the proof of theorem 1 and is given in the appendix. 

By implementing this in the manner of Section 6.1, the algorithm can be made to run 
in time O(max(n^2 α(n), ne, no, s)), where o is the number of constraints in O. 

7. Finite order of magnitude comparison 

In this section, it is demonstrated that algorithm solve_constraints can be applied to systems 
of constraints of the form "dist(a, b) < dist(c, d) / B" for finite B in ordinary Euclidean 
space as long as the number of symbols in the constraint network is smaller than B. 

We could be sure immediately that some such result must apply for finite B. It is 
a fundamental property of the non-standard real line that any sentence in the first-order 
theory of the reals that holds for all infinite values holds for any sufficiently large finite 
value, and that any sentence that holds for some infinite value holds for arbitrarily large 
finite values. Hence, since the answer given by algorithm solve_constraints works over a 
set of constraints S when the constraint "od(a, b) ≪ od(c, d)" is interpreted as "od(a, b) 
< od(c, d) / B for infinite B", the same answer must be valid for sufficiently large finite B. 
What is interesting is that we can find a simple characterization of B in terms of S; namely, 
that B is larger than the number of symbols in S. 

We begin by modifying the form of the constraints, and the interpretation of a cluster 
tree. First, to avoid confusion, we will use a four-place predicate "much_closer(a, b, c, d)" 
rather than the form "od(a, b) ≪ od(c, d)" as we are not going to give an interpretation to 
"od" as a function. We fix a finite value B > 1, and interpret "much_closer(a, b, c, d)" to 
mean "dist(a, b) < dist(c, d) / B." 

We next redefine what it means for a valuation to instantiate a cluster tree: 

Definition 6: Let T be a cluster tree and let Γ be a valuation on the symbols in T. We say 
that Γ ⊢ T if the following holds: For any symbols a, b, c, d in T, let M be the least common 
ancestor of a, b and let N be the least common ancestor of c, d. If M.label < N.label then 
much_closer(a, b, c, d). 

Procedure "instantiate", which generates an instantiation of a cluster tree, is modified 
as follows: 

procedure instantiate(in T : cluster tree; Ω : Euclidean space; B : real); 
    return : array of points indexed on the symbols of T; 

    Let n be the number of nodes in T; 
    α := 2 + 2n + Bn; 
    Choose δ1, δ2 ... δn such that δi < δi+1 / α; 
    pick a point x ∈ Ω; 
    G[T] := x; 
    instantiate1(T, Ω, δ1 ... δn, G); 
    return the restriction of G to the symbols of T. 
end instantiate. 

instantiate1(in N : a node in a cluster tree; Ω : a Euclidean space; 
             δ1 ... δn : orders of magnitude; 
             in out G : array of points indexed on the nodes of T) 
    if N is not a leaf then 
        let C1 ... Cp be the children of N; 
        x1 := G[N]; 
        q := N.label; 
        pick points x2 ... xp such that 
            for all i, j ∈ 1 ... p, if i ≠ j then δq < dist(xi, xj) < nδq 
        /* This is possible since p < n. */ 
        for i = 1 ... p do 
            G[Ci] := xi; 
            instantiate1(Ci, Ω, δ1 ... δn, G) 
        endfor 
    endif end instantiate1. 
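To make the construction concrete, here is a minimal Python sketch of the revised procedure for the one-dimensional case. Several choices are assumptions of the sketch rather than part of the procedure above: a cluster tree is encoded as either a symbol (a string) or a pair (label, children); δq is taken to be (α + 1)^q with δ0 = 0; the children of a node are placed at equal spacing slightly larger than δq; and the children of a node labelled 0 are placed at the same point.

```python
from fractions import Fraction


def instantiate(tree, B):
    """Place the symbols of a cluster tree on the real line.

    tree is a symbol (str) or a pair (label, children).
    Returns a dict mapping each symbol to its position.
    """
    def count(node):
        return 1 if isinstance(node, str) else 1 + sum(count(c) for c in node[1])

    n = count(tree)                       # number of nodes in T
    alpha = 2 + 2 * n + B * n

    def delta(q):
        # delta_0 = 0; otherwise (alpha + 1)^q, so delta_q < delta_{q+1} / alpha
        return Fraction(0) if q == 0 else Fraction(alpha + 1) ** q

    step = Fraction(2 * n + 1, 2 * n)     # spacing factor slightly above 1
    points = {}

    def place(node, x):
        if isinstance(node, str):
            points[node] = x
            return
        label, children = node
        for i, child in enumerate(children):
            # The first child sits at G[N]; for a non-zero label q, sibling
            # distances lie strictly between delta_q and n * delta_q.
            place(child, x + i * step * delta(label))

    place(tree, Fraction(0))
    return points


def much_closer(points, a, b, c, d, B):
    """dist(a, b) < dist(c, d) / B for points on the real line."""
    return abs(points[a] - points[b]) < abs(points[c] - points[d]) / B
```

Applied to the cluster tree of Table 1 with B = 6 (larger than the five symbols), the resulting valuation satisfies all three constraints of that table under the much_closer reading.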

The analogue of lemma 2 holds for the revised algorithm: 
Lemma 22: Any cluster tree T has an instantiation in Euclidean space ℝ^m of any 
dimensionality m. 

We can now state theorem 3, which asserts the correctness of algorithm "solve_constraints" 
in this new setting: 

Theorem 3: Let S be a set of constraints over n variables of the form "dist(a, b) < 
dist(c, d) / B", where B > n. The algorithm solve_constraints(S) returns a cluster tree 
satisfying S if S is consistent over Euclidean space, and returns false if S is inconsistent. 

The proofs of lemma 22 and theorem 3 are given in the appendix. 

An examination of the proof of lemma 22 shows that this result does not depend on 
any relation between n and B. Therefore, if solve_constraints(S) returns a tree T, then S 
is consistent and T satisfies S regardless of the relation between n and B. However, it is 
possible for S to be consistent and solve_constraints(S) to return false if n > B. On the 
other hand, one can see from the proof of theorem 3 (particularly lemma 23) that if B > n 
and solve_constraints(S) returns false then S is inconsistent in any metric space. However, 
there are metric spaces other than ℝ^m in which the cluster tree returned by solve_constraints 
may have no instantiation. 

8. The first-order theory 

Our final result asserts that if the om-space is rich enough then the full first-order language 
of order-of-magnitude distance comparisons is decidable. Specifically, if the collection of 
orders of magnitude is dense and unbounded above, then there is a decision algorithm for 
first-order sentences over the formula "od(W, X) ≪ od(Y, Z)" that runs in time O(4^n (n!)^2 s), 
where n is the number of variables in the sentence and s is the length of the sentence. 

The basic reason for this is the following: As we have observed in corollary 4, a cluster 
tree T determines the truth value of all constraints of the form "od(a, b) ≪ od(c, d)" where 
a, b, c, d are symbols in the tree. That is, any two instantiations of T in any two 
om-spaces agree on any such constraint. If we further require that the om-spaces are dense 
and unbounded, then a much stronger statement holds: Any two instantiations of T over 
such om-spaces agree on any first-order formula free in the symbols of T over the relation 
"od(W, X) ≪ od(Y, Z)". Hence, it suffices to check the truth of a sentence over all possible 
cluster trees on the variables in the sentence. Since there are only finitely many cluster 
trees over a fixed set of variables (taking into account only the relative order of the labels 
and not their numeric values), this is a decidable procedure. 

Let ℒ be the first-order language with equality with no constant or function symbols, 
and the single predicate symbol "much_closer(a, b, c, d)". It is easily shown that ℒ is as 
expressive as the language with the function symbol "od" and the relation symbol ≪. 

Definition 7: An om-space Ω with orders of magnitude D is dense if it satisfies the 
following axiom: 

A.9 For all orders of magnitude δ1 ≪ δ3 in D, there exists an order of magnitude δ2 in D 
such that δ1 ≪ δ2 ≪ δ3. 

Ω is unbounded above if it satisfies the following: 

A.10 For every order of magnitude δ1 in D there exists δ2 in D such that δ1 ≪ δ2. 

If D is the collection of orders of magnitude in the hyperreal line, then both of these 
are satisfied. In axiom [A.9], if 0 ≠ δ1 ≪ δ3, choose δ2 = √(δ1 δ3), the geometric mean. 
If 0 = δ1 ≪ δ3, choose δ2 = δ3 δ where δ ≪ 1. In axiom [A.10] choose δ2 = δ1 / δ where 
0 < δ ≪ 1. 

Definition 8: Let T be a cluster tree. Let l0 = 0, l1, l2 ... lk be the distinct labels in T 
in ascending order. An extending label for T is either (a) li for some i; (b) lk + 1 (note that 
lk is the label of the root); (c) (l(i-1) + li)/2 for some i between 1 and k. 

Note that if T has k distinct non-zero labels, then there are 2k + 2 different extending 
labels for T. 
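The count 2k + 2 can be checked with a short Python sketch. The function name extending_labels and the representation of the labels as an ascending list of integers beginning with 0 are assumptions made here for illustration.

```python
from fractions import Fraction


def extending_labels(labels):
    """All extending labels for a tree whose distinct labels, in ascending
    order and starting with 0, are given by the list `labels`."""
    existing = list(labels)                               # case (a): k + 1 values
    above_root = [labels[-1] + 1]                         # case (b): 1 value
    midpoints = [Fraction(lo + hi, 2)                     # case (c): k values
                 for lo, hi in zip(labels, labels[1:])]
    return sorted(existing + above_root + midpoints)
```

For the tree of Table 1, whose distinct labels are 0, 4 and 5 (so k = 2), this yields the six extending labels 0, 2, 4, 9/2, 5 and 6.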

Definition 9: Let T be a cluster tree. Let x be a symbol not in T. The cluster tree 
T' extends T with x if T' is formed from T by applying one of the following operations (a 
single application of a single operation). 

1. T is the null tree and T' is the tree containing the single node x. 

2. T consists of the single node for symbol y. Make a new node M, make both x and y 
children of M, and set the label of M to be either 0 or 1. 

3. For any internal node N of T (including the root), make x a child of N. 

4. Let y be a symbol in T, and let N be its father. If N.label ≠ 0, create a new node M 
with an extending label for T such that M.label < N.label. Make M a child of N, 
and make x and y children of M. 

5. Let C be an internal node of T other than the root, and let N be its father. Create 
a new node M with an extending label for T such that C.label < M.label < N.label. 
Make M a child of N and make x and C children of M. 

6. Let R be the root of T. Create a new node M such that M.label = R.label + 1. Make 
R and x children of M. Thus M is the root of the new tree T'. 

(See Figure 3.) 

Note that if T is a tree of n symbols and at most n - 1 internal nodes then 

• There are n - 1 ways to carry out step 3. 

• There are n possible ways to choose symbol y in step 4, and at most 2n - 2 for the 
label on M in each. 

• There are at most n - 2 different choices for C in step 5, and at most 2n - 3 choices 
for the label on M in each. 

• There is only one way to carry out step 6. 

Hence, there are fewer than 4n^2 different extensions of T by x. (This is almost certainly 
an overestimate by at least a factor of 2, but the final algorithm is so entirely impractical 
that it is not worthwhile being more precise.) 

Definition 10: Let T be a cluster tree, and let φ be a formula of ℒ open in the variables 
of T. T satisfies φ if every instantiation of T satisfies φ. 

Theorem 4: Let T be a cluster tree. Let φ be an open formula in ℒ, whose free variables 
are the symbols of T. Let Ω be an om-space that is dense and unbounded above. Algorithm 
decide(T, φ) returns true if T satisfies φ and false otherwise. 

function decide(T : cluster tree; φ : formula) return boolean 
    convert φ to an equivalent form in which the only logical symbols in φ are 
        ¬ (not), ∧ (and), ∃ (exists), = (equals) and variable names, 
        and the only non-logical symbol is the predicate "much_closer". 
    case 
        φ has form X = Y: return(distance(X, Y, T) = 0); 
        φ has form "much_closer(W, X, Y, Z)": return(distance(W, X, T) < distance(Y, Z, T)); 
        φ has form ¬ψ: return not(decide(T, ψ)) 
        φ has form ψ ∧ θ: return(decide(T, ψ) and decide(T, θ)) 
        φ has form ∃X α: 
            if for some extension T' of T by X, decide(T', α) = true 
                then return true 
                else return false endif endcase 
end decide 

function distance(X, Y : symbol; T : cluster tree) return integer 
    N := the least common ancestor of X and Y in T; 
    return(N.label) 
end distance 
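The quantifier-free cases of decide can be sketched in Python as follows. This is a partial sketch under stated assumptions: a cluster tree is encoded as either a symbol (a string) or a pair (label, children); formulas are nested tuples tagged "eq", "much_closer", "not" and "and"; and the existential case, which requires enumerating the extensions of Definition 9, is deliberately omitted. The names lca_label and decide_qf are introduced here.

```python
def lca_label(tree, x, y):
    """Label of the least common ancestor of symbols x and y in tree.

    tree must be an internal node (label, children); a symbol's least
    common ancestor with itself is the leaf, whose label is 0.
    """
    if x == y:
        return 0

    def symbols(node):
        if isinstance(node, str):
            return {node}
        return set().union(*(symbols(c) for c in node[1]))

    label, children = tree
    for child in children:
        if not isinstance(child, str) and {x, y} <= symbols(child):
            return lca_label(child, x, y)
    return label


def decide_qf(tree, phi):
    """Evaluate a quantifier-free formula on a cluster tree."""
    op = phi[0]
    if op == "eq":
        return lca_label(tree, phi[1], phi[2]) == 0
    if op == "much_closer":
        return lca_label(tree, phi[1], phi[2]) < lca_label(tree, phi[3], phi[4])
    if op == "not":
        return not decide_qf(tree, phi[1])
    if op == "and":
        return decide_qf(tree, phi[1]) and decide_qf(tree, phi[2])
    raise ValueError("unsupported connective: %r" % (op,))
```

On the tree of Table 1, for instance, much_closer(w, x, x, v) evaluates to true because the least common ancestor of w and x has label 4 while that of x and v has label 5.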

The proof of theorem 4 is given in the appendix. 

Running time: As we have remarked above, for a tree T of size k there are at most 4k^2 
extensions of T to be considered. The total number of cluster trees considered is therefore 
bounded by the product of 4k^2 over k = 1 ... n, which is 4^n (n!)^2. It is easily verified that 
the logical operators other than quantifiers add at most a factor of s, where s is the length 
of the sentence. Hence the running time is bounded by O(4^n (n!)^2 s). 

A key lemma, of interest in itself, states the following: 
Lemma 28: Let T be a cluster tree. Let φ be an open formula in ℒ, whose free variables 
are the symbols of T. Let Ω be an om-space that is dense and unbounded above. If one 
instantiation Γ of T in Ω satisfies φ then every instantiation of T in Ω satisfies φ. 

That is, either φ is true for all instantiations of T or for none. The proof is given in the 
appendix. 

It should be observed that the above conditions on Ω in lemma 28 are necessary, and 
that the statement is false otherwise. For example, let Ω be the om-space described in 
example 1, Section 3, of polynomials over an infinitesimal δ. Then Ω is not unbounded 
above; there is a maximum order-of-magnitude O(1). Let T be the starting tree of Figure 
3 (upper-left corner). Let φ be the formula "∃X od(V, W) ≪ od(W, X)", free in V and W. 
Then the valuation {U → δ, V → 0, W → 1} satisfies T but not φ, whereas the valuation 
{U → δ^2, V → 2δ^2, W → δ} satisfies both T and φ. 

9. Conclusions 

The applications of the specific algorithms above are undoubtedly limited; we are not aware 
of any practical problems where solving systems of order-of-magnitude relations on distances 
is the central problem. However, the potential applications of order-of-magnitude reasoning 
generally are very widespread. Ordinary commonsense reasoning involves distances span- 
ning a ratio of about 10^, from a fraction of an inch to thousands of miles, and durations 
spanning a ratio of about 10^", from a fraction of a second to a human lifetime. Scientific 
reasoning spans much greater ranges. Explaining the dynamics of a star combines reasoning 
about nuclear reactions with reasoning about the star as a whole; these differ by a ratio 
of about 10^^. The techniques needed to compute with quantities of such vastly differing 
sizes are quite different from the techniques needed to compute with quantities all of similar 
sizes. This paper is a small step in the development and analysis of such computational 
techniques. 

The above results are also significant in the encouragement that they give to the hope 
that order-of-magnitude reasoning specifically, and qualitative reasoning generally, may lead 
to useful quick reasoning strategies in a broader range of problems. It has often been found 
in AI that moving from greater to lesser precision in the mode of inference or type of 
knowledge does not lead to quick and dirty heuristic techniques, but rather to slow and 
dirty techniques. Nonmonotonic reasoning is the most notorious example of this, but it 
arises as well in many other types of automated reasoning, including qualitative spatial and 
physical reasoning. The algorithms developed in this paper are a welcome exception to 
this rule. We are currently studying algorithmic techniques for other order-of-magnitude 
problems, and are optimistic about finding similarly favorable results. 

Appendix A. Proofs 

In this appendix, we give the proofs of the various results asserted in the body of the paper. 
Proof of Lemma 2 

Lemma 2: If T is a cluster tree and Q is an om-space, then instantiate (T, U) returns an 
instantiation of T. 

Proof: Let So = 0. For any node N, if «=7V. label, we define A{N) = Si. The proof then 
proceeds in the following steps: 

i. For any nodes M,iV, if M is a descendant of AT in T then od(G[M], G[N]) < A{N). 
Proof: If M is a child of TV, then this is immediate from the construction oi X2 ■ ■ - Xp 
in instantiate!. Else, let N = Ni, N2 . . . Ng = M be the path from N to M through 
T. By the definition of a cluster tree, it follows that A^j. label < A'. label, for i > I and 
therefore A(Aj) < A(iV). Thus od{G[M],G[N]) < (by the o.m.-triangle inequality) 
maxj^i...g_i(od(G[Aj+i], G[Aj])) < maxj^i...g_i(A(A^j)) (since A^j+i is the child of 
AT,) < A(7V). 

ii. Let AT be a node in T; let Ci and C2 be two distinct children of N; and let Mi 
and M2 be descendants of Ci and C2 respectively. Then od(G[Mi], G[M2]) = A{N). 
Proof: By the construction ofx2- ■ - Xp in instantiate! (A"), od(G[Ci], G[C2]) = A(A'). 
By part (i.), od(G[Mi], G[Ci]) < A(Ci) < A{N) and likewise od(G[M2], G[C2]) < 
A{N). Hence, by axiom A.6, od(G[Mi], G[M2]) = A{N). 

iii. Let a and b be any two leaves in T, and let A" be the least common ancestor in T of 
a and b. Then od{G[a],G[b]) = A{N). Proof: Immediate from (ii). 

iv. For any node A^, odiam(r(A^)) = A (A"). Proof: From (iii), any two leaves descending 
from different children of A^ are at a distance of order A(A^), and no two leaves of A" 
are at a distance of order greater than A(A^). 

v. For any node N, $\Gamma(N)$ is a cluster of $\Gamma(T)$. Proof: Let a and b be leaves of N, 
and let c be a leaf of $T - N$. Let I be the least common ancestor of a and b in T and 
let J be the least common ancestor of a and c. Then I is either N or a descendant of N 
and J is a proper ancestor of N. Therefore by part (i), $\Delta(I) \ll \Delta(J)$. But by (iii), 
$\mathrm{od}(\Gamma(a), \Gamma(b)) = \Delta(I) \ll \Delta(J) = \mathrm{od}(\Gamma(a), \Gamma(c))$. 

vi. For any internal nodes N, M, if M.label < N.label then $\mathrm{odiam}(\Gamma(M)) \ll \mathrm{odiam}(\Gamma(N))$. 
Proof: Immediate from (iv) and the construction of $\Delta$. 

vii. If C is a cluster of $\Gamma(T)$ then there is a node N in T such that $C = \Gamma(N)$. Proof: Let 
S be the set of symbols corresponding to C and let N be the least common ancestor 
of all of S. Let a and b be two symbols in S that are in different subtrees of N. Then 
by (iii), $\mathrm{od}(G[a], G[b]) = \Delta(N)$. Let x be any symbol in N.symbols. Then by (iii) 
$\mathrm{od}(G[a], G[x]) \le \Delta(N)$. Hence $G[x] \in C$. 

□ 

Proof of Theorem 1 

We here prove the correctness of algorithm solve_constraints. We will assume throughout 
that the two variables in the long of any constraint in S are distinct. 

Lemma 3: Let T be a cluster tree and let $\Gamma$ be an instantiation of T. Let a and b be 
symbols of T. Let N be the least common ancestor of a and b in T. Then 
$\mathrm{od}(\Gamma(a), \Gamma(b)) = \mathrm{odiam}(\Gamma(N))$. 

Proof: Since $\Gamma(a)$ and $\Gamma(b)$ are elements of $\Gamma(N)$, it follows from the definition of odiam that 
$\mathrm{od}(\Gamma(a), \Gamma(b)) \le \mathrm{odiam}(\Gamma(N))$. Suppose the inequality were strict; that is, 
$\mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{odiam}(\Gamma(N))$. Then let C be the set of all the symbols c of T such that 
$\mathrm{od}(\Gamma(a), \Gamma(c)) \le \mathrm{od}(\Gamma(a), \Gamma(b))$. Then $\mathrm{odiam}(\Gamma(C)) = \mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{odiam}(\Gamma(N))$. 
It is easily shown that $\Gamma(C)$ is a cluster in $\Gamma(T)$. Therefore, by property (ii) of definition 5, there must be 
a node M such that M.symbols = C. Now, M is certainly not an ancestor of N, since 
$\mathrm{odiam}(\Gamma(M)) \ll \mathrm{odiam}(\Gamma(N))$, but M.symbols contains both a and b. But this contradicts 
the assumption that N was the least common ancestor of a and b. □ 

Corollary 4: Let T be a cluster tree and let $\Gamma$ be an instantiation of T. Let a, b, c, d be 
symbols of T. Let N be the least common ancestor of c and d in T, and let M be the 
least common ancestor of a and b in T. Then $\mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{od}(\Gamma(c), \Gamma(d))$ if and only if 
M.label < N.label. 

Proof: Immediate from lemma 3 and property (iii) of definition 5 of instantiation. □ 

Lemma 5: Let S be any set of constraints of the form $\mathrm{od}(a, b) \ll \mathrm{od}(c, d)$. Let H be the 
connected components of the shorts of S. If S is consistent, then not every edge of S is in 
H. 

Proof: Let $\Gamma$ be a valuation satisfying S. Find an edge pq in S for which $\mathrm{od}(\Gamma(p), \Gamma(q))$ is 
maximal. Now, if ab is a short of S (that is, there is a constraint $\mathrm{od}(a, b) \ll \mathrm{od}(c, d)$ in 
S) then $\mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{od}(\Gamma(c), \Gamma(d)) \le \mathrm{od}(\Gamma(p), \Gamma(q))$. 

Now, let ab be any edge in H, the connected components of the shorts of S. Then there 
is a path $a_1 = a, a_2 \ldots a_k = b$ such that the edge $a_i a_{i+1}$ is a short of S for $i = 1 \ldots k-1$. 
Thus, by the om-triangle inequality, 
$\mathrm{od}(\Gamma(a), \Gamma(b)) \le \max_{i=1 \ldots k-1} \mathrm{od}(\Gamma(a_i), \Gamma(a_{i+1})) \ll \mathrm{od}(\Gamma(p), \Gamma(q))$. 
Hence $pq \ne ab$, so pq is not in H. □ 

Lemma 6: The values of S and H in any iteration are supersets of their values in any later 
iteration. 

Proof: S is reset to a subset of itself at the end of each iteration. H is defined in terms of 
S in a monotonic manner. □ 

Lemma 7: S cannot be the same in two successive iterations of the main loop. 

Proof: by contradiction. Suppose that S is the same in two successive iterations. Then H 
will be the same, since it is defined in terms of S. H is constructed to contain all the shorts 
of S. Since the resetting of S at the end of the first iteration does not change S, H must 
contain all the longs as well. Thus, H contains all the edges in S. But that being the case, 
the algorithm should have terminated with failure at the beginning of the first iteration. □ 

Lemma 8: Algorithm solve_constraints always terminates. 

Proof: By lemma 7, if the algorithm does not exit with failure, then on each iteration some 
constraints are removed from S. Hence, the number of iterations of the main loop is at 
most the original size of S. Everything else in the algorithm is clearly bounded. (Note that 
this bound on the number of iterations is improved in Section 6.1 to $n - 1$, where n is the 
number of symbols.) □ 

Lemma 9: If algorithm solve_constraints returns false, then S is inconsistent. 

Proof: If the algorithm returns false, then the transitive closure of the shorts of S contains 
all the edges in S. By lemma 5, S is inconsistent. 

Lemma 10: If constraint C of form $\mathrm{od}(a, b) \ll \mathrm{od}(c, d)$ is in the initial value of S, and 
edge cd is in H in some particular iteration, then constraint C is in S at the start of that 
iteration. 

Proof: Suppose that C is deleted from S on some particular iteration. Then edge cd, the 
long of C, cannot be in H in that iteration. That is, it is not possible for edge cd to persist 
in H in an iteration after C has been deleted from S. Note that, by lemma 6, once cd is 
eliminated from H, it remains out of H. □ 

Lemma 11: The following loop invariant holds: At the end of each loop iteration, the 
values of L.symbols, where L is a leaf in the current state of the tree, are exactly the 
connected components of H. 

Proof: In the first iteration, T is initially just the root R, containing all the symbols, and 
a child of R is created for each connected component of H. 

Let $T_i$ and $H_i$ be the values of T and H at the end of the ith iteration. Suppose that 
the invariant holds at the end of the kth iteration. By lemma 6, $H_{k+1}$ is a subset of $H_k$. 
Hence, each connected component of $H_{k+1}$ is a subset of a connected component of $H_k$. 
Moreover, each connected component J of $H_k$ is either a connected component of $H_{k+1}$ or 
is partitioned into several connected components of $H_{k+1}$. In the former case, the leaf of 
$T_k$ corresponding to J is unchanged and remains a leaf in $T_{k+1}$. In the latter case, the leaf 
corresponding to J gets assigned one child for each connected component of $H_{k+1}$ that is a 
subset of J. Thus, the connected components of $H_{k+1}$ correspond to the leaves of $T_{k+1}$. □ 

Lemma 12: If procedure solve_constraints does not return false, then it returns a 
well-formed cluster tree T. 

Proof: Using lemma 11, and the cleanup section of solve_constraints which creates the final 
leaves for symbols, it follows that every symbol in S ends up in a single leaf of T. As m is 
decremented on each iteration, and as no iteration adds both a new node and children of 
that node, it follows that the label of each internal node is less than the label of its father. 
Hence the constraints on cluster trees (definition 3) are satisfied. □ 

Lemma 13: Let a, b be two distinct symbols in S and let T be the cluster tree returned 
by solve_constraints for S. Let N be the least common ancestor of a, b in T. Then either N 
is assigned its label on the first iteration when the edge ab is not in H, or the edge ab is in 
the final value of H when the loop is exited and N is assigned its label in the final cleanup 
section. 

Proof: As above, let $H_i$ be the value of H in the ith iteration. 

If N is the root, then it is assigned its label in the first iteration. Clearly, a and b, being 
in different subtrees of N, must be in different connected components of $H_1$. 

Suppose N is assigned its label in the kth iteration of the loop for k > 1. By lemma 11, 
at the end of the previous iteration, N.symbols was a connected component of $H_{k-1}$, and it 
therefore contained the edge ab. Since N is the least common ancestor of a, b, it follows that 
a and b are placed in two different children of N; hence, they are in two different connected 
components of $H_k$. Thus the edge ab cannot be in $H_k$. 

Suppose N is assigned its label in the cleanup section of the algorithm. Then by lemma 
11, N.symbols is a connected component of the final value of H. Hence the edge ab was in 
the final value of H. □ 

Lemma 14: Let S initially contain constraint C of form $\mathrm{od}(a, b) \ll \mathrm{od}(c, d)$. Suppose that 
solve_constraints(S) returns a cluster tree T. Let M be the least common ancestor of a, b 
in T and let N be the least common ancestor of c, d. Then M.label < N.label. 

Proof: Suppose N is given a label in a given iteration. By lemma 13, cd is eliminated 
from H in that same iteration. By lemma 10, constraint C must be in S at the start of the 
iteration. Hence ab is a short of S in the iteration, and is therefore in H. Hence M is not 
given a label until a later iteration, and therefore is given a lower label. 

It is easily seen that cd cannot be in H in the final iteration of the loop, and hence N 
is not assigned its label in the cleanup section. □ 

Lemma 15: Suppose that solve_constraints(S) returns a cluster tree T. Then any 
instantiation of T satisfies the constraints S. 

Proof: Immediate from lemma 14 and corollary 4. 






Theorem 1: The algorithm solve_constraints(S) returns a cluster tree satisfying S if S is 
consistent, and returns false if S is inconsistent. 

Proof: If solve_constraints(S) returns false, then S is inconsistent (lemma 9). If it does 
not return false, then it returns a cluster tree T (lemma 12). Since T has an instantiation 
(lemma 2) and since every instantiation of T is a solution of S (lemma 15), it follows that 
S is consistent and T satisfies S. □ 
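
The loop structure that lemmas 5 through 14 reason about can be sketched in Python. This is only a reconstruction from the properties used in the proofs above (the root starts with all symbols, leaves are split along the connected components of H, S is reset to the constraints whose longs lie in H, the label counter m is decremented, and a cleanup pass creates the final leaves); the authoritative pseudocode is the one in the body of the paper and may differ in detail. The dictionary-based tree, the tuple encoding of constraints, and the helper names components and lca_label are assumptions made for this illustration.

```python
def components(vertices, edges):
    """Map each vertex to a representative of its connected component (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in vertices}


def solve_constraints(constraints):
    """constraints: list of ((a, b), (c, d)), read as od(a,b) << od(c,d).
    Returns a nested-dict cluster tree, or None in place of 'false'."""
    S = list(constraints)
    if any(c == d for _, (c, d) in S):
        return None  # a long with identical endpoints has distance 0
    symbols = sorted({x for short, long_ in S for x in short + long_})
    root = {"symbols": symbols, "label": None, "children": []}
    leaves, m = [root], len(symbols)
    while S:
        comp = components(symbols, [short for short, _ in S])

        def in_H(edge):
            return comp[edge[0]] == comp[edge[1]]

        # Shorts are always in H, so "H contains every edge of S"
        # reduces to "every long is in H" (the failure test of lemma 5).
        if all(in_H(long_) for _, long_ in S):
            return None
        next_leaves = []
        for node in leaves:  # only the leaves present at the start of the iteration
            groups = {}
            for s in node["symbols"]:
                groups.setdefault(comp[s], []).append(s)
            if len(groups) > 1:
                node["label"] = m
                node["children"] = [
                    {"symbols": g, "label": None, "children": []}
                    for g in groups.values()
                ]
                next_leaves.extend(node["children"])
            else:
                next_leaves.append(node)
        leaves = next_leaves
        S = [c for c in S if in_H(c[1])]  # keep constraints whose long is in H
        m -= 1
    for node in leaves:  # cleanup: remaining clusters get label 0
        node["label"] = 0
        if len(node["symbols"]) > 1:
            node["children"] = [
                {"symbols": [s], "label": 0, "children": []} for s in node["symbols"]
            ]
    return root


def lca_label(tree, a, b):
    """Label of the least common ancestor of symbols a and b."""
    node = tree
    while True:
        below = [ch for ch in node["children"]
                 if a in ch["symbols"] and b in ch["symbols"]]
        if not below:
            return node["label"]
        node = below[0]
```

On the single constraint od(a, b) ≪ od(b, c), the sketch returns a tree in which lca_label for (a, b) is smaller than for (b, c), which is the relation lemma 14 asserts; on a pair of mutually contradictory constraints it returns None.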

Proof of Theorem 2 

Lemma 16: If $S_1$ and $S_2$ are consistent sets of constraints, and $S_1 \supseteq S_2$ then 
reduce_constraints($S_1$) $\supseteq$ reduce_constraints($S_2$). 

Proof: Immediate by construction. The value of H in the case of $S_1$ is a superset of its value 
in the case of $S_2$, and hence reduce_constraints($S_1$) is a superset of reduce_constraints($S_2$). 

Lemma 17: If $S_1$ and $S_2$ are consistent sets of constraints, and $S_1 \supseteq S_2$ then 
num_labels($S_1$) $\ge$ num_labels($S_2$). 

Proof by induction on num_labels($S_2$). If num_labels($S_2$) = 0, the statement is trivial. 
Suppose that the statement holds for all S', where num_labels(S') = k. 
Let num_labels($S_2$) = k + 1. 

Then k + 1 = num_labels($S_2$) = 1 + num_labels(reduce_constraints($S_2$)), so 
k = num_labels(reduce_constraints($S_2$)). Now, suppose $S_1 \supseteq S_2$. By lemma 16 
reduce_constraints($S_1$) $\supseteq$ reduce_constraints($S_2$). But then by the inductive hypothesis 
num_labels(reduce_constraints($S_1$)) $\ge$ num_labels(reduce_constraints($S_2$)), so 
num_labels($S_1$) $\ge$ num_labels($S_2$). □ 

Lemma 18: Let S be a set of constraints, and let $\Gamma$ be a solution of S. For any graph G 
over the symbols of S, let nd(G, $\Gamma$) be the number of different non-zero values of od(a, b) 
where edge ab is in G. Let edges(S) be the set of edges in S. Then nd(edges(S), $\Gamma$) $\ge$ 
num_labels(S). 

Proof: by induction on num_labels(S). If num_labels(S) = 0, then the statement is trivial. 
Suppose for some k, the statement holds for all S' where num_labels(S') = k, and suppose 
num_labels(S) = k + 1. Let pq be the edge in S of maximal length. For any set of edges E, 
let small-edges(E, $\Gamma$) be the set of all edges ab in E for which 
$\mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{od}(\Gamma(p), \Gamma(q))$. Since small-edges(E, $\Gamma$) contains edges of every order of 
magnitude in E except the order of magnitude of pq, it follows that 
nd(small-edges(E, $\Gamma$), $\Gamma$) = nd(E, $\Gamma$) $- 1$. Let G be the complete graph over all the symbols 
in S. By the same argument as in lemma 5, small-edges(G, $\Gamma$) $\supseteq$ H, where H is the connected 
components of the shorts of S, as computed in reduce_constraints(S). Let S' be the set of 
constraints whose longs are in small-edges(G, $\Gamma$). It follows that S' $\supseteq$ reduce_constraints(S). 
Now small-edges(G, $\Gamma$) $\supseteq$ edges(S') $\supseteq$ edges(reduce_constraints(S)). 
Hence nd(edges(S), $\Gamma$) = nd(G, $\Gamma$) = nd(small-edges(G, $\Gamma$), $\Gamma$) + 1 
$\ge$ nd(edges(reduce_constraints(S)), $\Gamma$) + 1 $\ge$ (by the inductive hypothesis) 
num_labels(reduce_constraints(S)) + 1 = num_labels(S). □ 

Theorem 2: Out of all solutions to the set of constraints S, the instantiations of 
solve_constraints(S) have the fewest number of different values of od(a, b), where a, b range 
over the symbols in S. This number is given by num_labels(S). 

Proof: Immediate from lemma 18. 
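
The recursion that lemmas 16 through 18 rely on, num_labels(S) = 1 + num_labels(reduce_constraints(S)), can be illustrated with a short Python sketch. The definitions of both functions are given in Section 6.1 and are not reproduced in this appendix, so the version below is an assumption pieced together from how the proofs use them: reduce_constraints keeps the constraints whose longs lie in the connected components of the shorts, and num_labels of the empty set is taken to be 0. The tuple encoding of constraints and the guard for inconsistent input are likewise choices made for this example.

```python
def components(vertices, edges):
    """Map each vertex to a representative of its connected component (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in vertices}


def reduce_constraints(symbols, S):
    """Assumed reading: keep the constraints of S whose long edge lies in H,
    the connected components of the shorts of S."""
    comp = components(symbols, [short for short, _ in S])
    return [c for c in S if comp[c[1][0]] == comp[c[1][1]]]


def num_labels(constraints):
    """Count reduction rounds until S is empty; constraints are
    ((a, b), (c, d)) tuples read as od(a,b) << od(c,d)."""
    S = list(constraints)
    symbols = {x for short, long_ in S for x in short + long_}
    rounds = 0
    while S:
        reduced = reduce_constraints(symbols, S)
        if len(reduced) == len(S):
            # No long falls outside H: by lemma 5 the set is inconsistent.
            raise ValueError("inconsistent constraint set")
        S = reduced
        rounds += 1
    return rounds
```

For the chain od(a, b) ≪ od(c, d), od(c, d) ≪ od(e, f) this gives 2, matching the two distinct non-zero orders of magnitude that any solution must use.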

Corollary 19: Let $\Omega$ have all the properties of an om-space except that it has only k 
different orders of magnitude. A system of constraints S has a solution in $\Omega$ if and only if 
the tree returned by solve_constraints(S) uses no more than k different labels. 

Proof: Immediate from theorems 1 and 2. □ 

Proof of Algorithm for Non-strict Comparisons 

We now prove that the revised algorithm presented in Section 6.2 for non-strict comparisons 
is correct. The proof is only a slight extension of the proof of theorem 1, given above. 
Recall that the revised algorithm in Section 6.2 replaces the line of solve_constraints 

H := the connected components of the shorts of S 

with the following code: 

1. H := the shorts of S; 
2. repeat H := the connected components of H; 
3.     for each weak constraint $\mathrm{od}(a, b) \le \mathrm{od}(c, d)$ 
4.         if cd is in H then add ab to H endif endfor 
5. until no change has been made to H in the last iteration. 
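
The five steps above amount to a fixpoint computation, which can be sketched in Python as follows. The representation is an assumption made for illustration: H is stored as a map from each symbol to a component representative, an edge counts as "in H" when its endpoints share a representative, and the function name close_under_weak is hypothetical.

```python
def components(vertices, edges):
    """Map each vertex to a representative of its connected component (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in vertices}


def close_under_weak(symbols, shorts, weak):
    """shorts: edges (a, b) taken from the strict constraints.
    weak: list of ((a, b), (c, d)), read as od(a,b) <= od(c,d).
    Returns the component map of H once no weak constraint adds an edge."""
    edges = set(shorts)                       # step 1
    while True:
        comp = components(symbols, edges)     # step 2
        changed = False
        for (a, b), (c, d) in weak:           # step 3
            # step 4: cd in H forces ab into H
            if comp[c] == comp[d] and comp[a] != comp[b]:
                edges.add((a, b))
                changed = True
        if not changed:                       # step 5
            return comp
```

Each pass either merges at least two components or stops, so the loop runs at most as many times as there are symbols.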

We need the following new lemmas and proofs: 

Lemma 20: Let S be a set of strict comparisons, and let W be a set of non-strict 
comparisons. Let H be the set of edges output by the above code. If $S \cup W$ is consistent, then 
there is an edge in S that is not in H. 

Proof: As in the proof of lemma 5, let $\Gamma$ be a valuation satisfying $S \cup W$ and let pq be 
an edge in S such that $\mathrm{od}(\Gamma(p), \Gamma(q))$ is maximal. We wish to show that, for every edge 
$ab \in H$, $\mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{od}(\Gamma(p), \Gamma(q))$, and hence $ab \ne pq$. Proof by induction: suppose 
that this holds for all the edges in H at some point in the code, and that ab is now to be 
added to H. There are three cases to consider. 

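• ab is added in step [1]. Then, as in lemma 5, there is a constraint $\mathrm{od}(a, b) \ll \mathrm{od}(c, d)$ 
in S. Hence $\mathrm{od}(\Gamma(a), \Gamma(b)) \ll \mathrm{od}(\Gamma(c), \Gamma(d)) \le \mathrm{od}(\Gamma(p), \Gamma(q))$. 

• ab is added in step [2]. Then there is a path $a_1 = a, a_2 \ldots a_k = b$ such that the edge 
$a_i a_{i+1}$ is in H for $i = 1 \ldots k-1$. By the inductive hypothesis, $\mathrm{od}(\Gamma(a_i), \Gamma(a_{i+1})) \ll 
\mathrm{od}(\Gamma(p), \Gamma(q))$. By the om-triangle inequality, 
$\mathrm{od}(\Gamma(a), \Gamma(b)) \le \max_{i=1 \ldots k-1} \mathrm{od}(\Gamma(a_i), \Gamma(a_{i+1})) \ll \mathrm{od}(\Gamma(p), \Gamma(q))$. 

• ab is added in step [4]. Then there is a constraint $\mathrm{od}(a, b) \le \mathrm{od}(c, d)$ in W such that 
cd is in H. By the inductive hypothesis, $\mathrm{od}(\Gamma(c), \Gamma(d)) \ll \mathrm{od}(\Gamma(p), \Gamma(q))$. Hence 
$\mathrm{od}(\Gamma(a), \Gamma(b)) \le \mathrm{od}(\Gamma(c), \Gamma(d)) \ll \mathrm{od}(\Gamma(p), \Gamma(q))$. 

□ 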

Lemma 21: Let W contain the constraint $\mathrm{od}(a, b) \le \mathrm{od}(c, d)$. Suppose that the algorithm 
returns a cluster tree T. Let M be the least common ancestor of a and b in T, and let N 
be the least common ancestor of c and d. Then M.label $\le$ N.label. 

Proof: By lemma 13, N is assigned a label in the first iteration where H does not include 
the edge cd. In all previous iterations, since cd is in H, ab will likewise be put into H. 
Hence M does not get assigned a label before N, so M.label $\le$ N.label. 

The remainder of the proof of the correctness of the revised algorithm is exactly the 
same as the proof of theorem 1. 

Validation of Algorithm Solve_constraints2 

The proof of the correctness of algorithm solve_constraints2 is again analogous in structure 
to the proof of theorem 1. We sketch it below: the details are not difficult to fill in. 

1. (Analogue of lemma 2:) If T is an ordered cluster tree, then the revised version of 
instantiate(T) returns an instantiation of T. The proof is exactly the same as lemma 
2, with the additional verification that instantiate2 preserves the orderings in T. 

2. (Analogue of lemma 5:) Let S be a set of order-of-magnitude constraints on distances, 
and let O be a set of ordering constraints on points. Let H be the graph given by the 
two statements 

H := the connected components of the shorts of S; 
H := incorporate_order(H, O); 

If S and O are consistent, then H does not contain all the edges of S. 

Proof: As in the proof of lemma 5, choose a valuation $\Gamma$ satisfying S, O and let pq be 
an edge in S for which $\mathrm{od}(\Gamma(p), \Gamma(q))$ is maximal. Following the informal argument 
presented in Section 6.3, it is easily shown that pq is longer than any of the edges 
added in these two statements, and hence it is not in H. 

3. (Analogue of lemma 9:) If solve_constraints2 returns false, then S, O is inconsistent. 
Proof: Immediate from (2). 

4. (Analogue of lemma 12:) If solve_constraints2(S, O) does not return false, then it 
returns a well-formed ordered cluster tree. 

Proof: By merging the strongly connected components of G, incorporate_order always 
ensures that the ordering arcs between connected components of H form a DAG. These 
arcs are precisely the same ones that are later added among the children of node N as 
ordering arcs. Thus, the ordering arcs over the children of a node in the cluster tree 
form a DAG. Otherwise, the construction of the tree T is the same as in lemma 12. 

The remainder of the proof is the same as the proof of theorem 1. 

Proof of Theorem 3 

We begin by proving lemma 22, that the revised version of "instantiate", given in Section 
6.3, gives an instantiation of a cluster tree in Euclidean space. 

Lemma 22: Any cluster tree T has an instantiation in Euclidean space $\mathbb{R}^m$ of any 
dimensionality m. 

The proof is essentially the same as the proof of Lemma 2, except that we now have 
to keep track of real quantities. For any node N, if i = N.label, we define $\Delta(N) = \delta_i$. The 
proof then proceeds in the following steps: 

i. For any i < j, $\delta_i \le \delta_j / \alpha^{j-i}$. Immediate by construction. 

ii. For any nodes M, C, if M is a descendant of C in T then 
$\mathrm{dist}(G[M], G[C]) < \alpha n \Delta(C)/(\alpha - 1)$. 

Proof: Let $C = C_0, C_1 \ldots C_r = M$ be the path from C to M through T. Then 
$\mathrm{dist}(G[M], G[C]) \le$ (by the triangle inequality) $\sum_{i=0}^{r-1} \mathrm{dist}(G[C_{i+1}], G[C_i]) \le 
\sum_{i=0}^{r-1} n\Delta(C)/\alpha^i < (\alpha/(\alpha - 1))(n\Delta(C))$. 

iii. Let N be a node in T; let $C_1$ and $C_2$ be two children of N; and let $M_1$ and $M_2$ be 
descendants of $C_1$ and $C_2$ respectively. Then 

$\Delta(N)(1 - 2n/(\alpha - 1)) < \mathrm{dist}(G[M_1], G[M_2]) < n\Delta(N)(1 + 2/(\alpha - 1))$ 

Proof: By the triangle inequality, 
$\mathrm{dist}(G[C_1], G[C_2]) \le \mathrm{dist}(G[C_1], G[M_1]) + \mathrm{dist}(G[M_1], G[M_2]) + \mathrm{dist}(G[M_2], G[C_2])$. 
Thus, $\mathrm{dist}(G[C_1], G[C_2]) - \mathrm{dist}(G[C_1], G[M_1]) - \mathrm{dist}(G[M_2], G[C_2]) \le \mathrm{dist}(G[M_1], G[M_2])$. 
Also, by the triangle inequality, 
$\mathrm{dist}(G[M_1], G[M_2]) \le \mathrm{dist}(G[C_1], G[C_2]) + \mathrm{dist}(G[C_1], G[M_1]) + \mathrm{dist}(G[M_2], G[C_2])$. 
By construction, $\Delta(N) \le \mathrm{dist}(G[C_1], G[C_2]) \le n\Delta(N)$, 
and by part (ii), for i = 1, 2, $\mathrm{dist}(G[M_i], G[C_i]) < \alpha n \Delta(C_i)/(\alpha - 1) \le n\Delta(N)/(\alpha - 1)$ 
as $\Delta(C_i) \le \Delta(N)/\alpha$. 

iv. For any symbols a, b, c, d in T, let P be the least common ancestor of a, b and let N 
be the least common ancestor of c, d. If P.label < N.label then 
much_closer(G[a], G[b], G[c], G[d]). 

Proof: By part (iii), $\mathrm{dist}(G[a], G[b]) < n\Delta(P)(1 + 2/(\alpha - 1))$ 
and $\mathrm{dist}(G[c], G[d]) > \Delta(N)(1 - 2n/(\alpha - 1))$. Since $\Delta(P) \le \Delta(N)/\alpha$ and since 
$\alpha = 2 + 2n + Bn$, it follows by straightforward algebra that 
$\mathrm{dist}(G[a], G[b]) < \mathrm{dist}(G[c], G[d]) / B$. 

□ 
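
The "straightforward algebra" invoked in part (iv) can be spelled out as follows. This expansion is not part of the original proof; it only combines the two bounds of part (iii) with $\Delta(P) \le \Delta(N)/\alpha$ and $\alpha = 2 + 2n + Bn$, assuming $n \ge 1$ and $B > 0$ so that every denominator is positive.

```latex
% Upper bound on the short distance and lower bound on the long distance:
\begin{align*}
\mathrm{dist}(G[a],G[b])
  &< n\,\Delta(P)\Bigl(1+\frac{2}{\alpha-1}\Bigr)
   \le \frac{n\,\Delta(N)}{\alpha}\cdot\frac{\alpha+1}{\alpha-1},\\
\frac{\mathrm{dist}(G[c],G[d])}{B}
  &> \frac{\Delta(N)}{B}\Bigl(1-\frac{2n}{\alpha-1}\Bigr)
   = \frac{\Delta(N)}{B}\cdot\frac{\alpha-1-2n}{\alpha-1}
   = \frac{\Delta(N)}{B}\cdot\frac{1+Bn}{\alpha-1}.
\end{align*}
% The last equality uses \alpha - 1 - 2n = 1 + Bn.
% It therefore suffices to check that
\[
\frac{n(\alpha+1)}{\alpha} \le \frac{1+Bn}{B}
\iff Bn(\alpha+1) \le \alpha(1+Bn)
\iff Bn \le \alpha = 2+2n+Bn,
\]
% which always holds.
```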

We next prove the analogue of lemma 5. 

Lemma 23: Let S be a set of constraints over n variables of the form 
"dist(a, b) < dist(c, d) / B", where B > n. If S is consistent, then there is some edge in S 
which is not in the connected components of the shorts of S. 

Proof: Let $\Gamma$ be a valuation satisfying S. Let pq be the edge in S for which $\mathrm{dist}(\Gamma(p), \Gamma(q))$ 
is maximal. Now, if ab is a short of S (that is, there is a constraint much_closer(a, b, c, d) 
in S) then $\mathrm{dist}(\Gamma(a), \Gamma(b)) < \mathrm{dist}(\Gamma(c), \Gamma(d))/B \le \mathrm{dist}(\Gamma(p), \Gamma(q))/B$. 

Now, let ab be any edge in H, the connected components of the shorts of S. Then 
there is a simple path $a_1 = a, a_2 \ldots a_k = b$ such that the edge $a_i a_{i+1}$ is a short of S for 
$i = 1 \ldots k-1$. Note that $k \le n$. Then, by the triangle inequality, 

$\mathrm{dist}(\Gamma(a), \Gamma(b)) \le \mathrm{dist}(\Gamma(a_1), \Gamma(a_2)) + \mathrm{dist}(\Gamma(a_2), \Gamma(a_3)) + \ldots + \mathrm{dist}(\Gamma(a_{k-1}), \Gamma(a_k))$ 
$< (k - 1)\,\mathrm{dist}(\Gamma(p), \Gamma(q)) / B < \mathrm{dist}(\Gamma(p), \Gamma(q))$. 

Hence $pq \ne ab$, so pq is not in H. □ 

Theorem 3: Let S be a set of constraints over n variables of the form "dist(a, b) < dist(c, d) 
/ B", where B > n. The algorithm solve_constraints(S) returns a cluster tree satisfying S 
if S is consistent over Euclidean space, and returns false if S is inconsistent. 

Proof: Note that the semantics of the constraints "much_closer(a, b, c, d)" enters into the 
proof of Theorem 1 only in lemmas 2 and 5. The remainder of the proof of Theorem 1 has to 
do purely with the relation between the structure of S and the structure of the tree. Hence, 
since we have shown that the analogues of lemmas 2 and 5 hold in a set of constraints of 
this kind, the same proof can be completed in exactly the same way. □ 

Proof of Theorem 4 

Lemma 24: Let T be a cluster tree and let $\Gamma$ be a valuation over om-space $\Omega$ satisfying T. 
Let x be a symbol not in T, let a be a point in $\Omega$, and let $\Gamma'$ be the valuation $\Gamma \cup \{x \to a\}$. 
Then there exists an extension T' of T by x such that $\Gamma'$ satisfies T'. 

Proof: If T is the empty tree, the statement is trivial. If T contains the single symbol y, 
then if $a = \Gamma(y)$ then operation (2) applies with M.label = 0; if $a \ne \Gamma(y)$ then operation (2) 
applies with M.label = 1. 

Otherwise, let y be the symbol in T such that $\mathrm{od}(\Gamma(y), a)$ is minimal. (We will deal 
with the case of ties in step (D) below.) Let F be the father of y in T. 

Let $D = \mathrm{od}(\Gamma(y), a)$. Let V be the set of all orders of magnitude $\mathrm{od}(\Gamma(p), \Gamma(q))$, where 
p and q range over symbols in T. We define L to be the suitable label for D as follows: If 
$D \in V$, then L is the label in T corresponding to D. If D is larger than any value in V 
then L is the label of the root of T plus 1. If $D \notin V$, but some value in V is larger than D, 
then let $D_1$ be the largest value in V less than D; let $D_2$ be the smallest value in V greater 
than D; let $L_1, L_2$ be the labels in T corresponding to $D_1, D_2$; and let $L = (L_1 + L_2)/2$. 

One of the following must hold: 

A. $\Gamma(y) = a$, and F.label = 0. Then apply operation (3) with N = F. 

B. $\Gamma(y) = a$ and F.label $\ne$ 0. Then apply operation (4) with M.label = 0. 

C. $\Gamma(y) \ne a$, but $\mathrm{od}(\Gamma(y), a)$ is less than $\mathrm{od}(\Gamma(z), a)$ for any other symbol $z \ne y$ in T. 
Apply operation (4) with M.label set to the suitable value for D in T. 

D. There is more than one value $y_1 \ldots y_k$ for which $\mathrm{od}(\Gamma(y_i), a) = D$. It is easily shown 
that in this case there is an internal node Q such that $y_1 \ldots y_k$ is just the set of symbols 
in the subtree of Q. There are three cases to consider: 

D.i $D = \mathrm{odiam}(\Gamma(Q.\mathrm{symbols}))$. Then apply operation (3) with N = Q. 

D.ii $D \gg \mathrm{odiam}(\Gamma(Q.\mathrm{symbols}))$, and Q is not the root. Then apply operation (5) 
with C = Q. Set M.label to be the suitable value for D. (It is easily shown that 
$D \ll \mathrm{odiam}(\Gamma(N.\mathrm{symbols}))$, where N is the father of Q.) 

D.iii $D \gg \mathrm{odiam}(\Gamma(Q.\mathrm{symbols}))$, and Q is the root. Apply operation (6). 

□ 

Lemma 25: Let $A = \{a_1 \ldots a_k\}$ be a finite set of points whose diameter has 
order-of-magnitude D. Then there exists a point u such that, for $i = 1 \ldots k$, $\mathrm{od}(u, a_i) = D$. 

Proof: Let $b_1 = a_1$. By axiom A.8 there exists an infinite collection of points $b_2, b_3 \ldots$ 
such that $\mathrm{od}(b_i, b_j) = D$ for $i \ne j$. Now, for any value $a_i$ there can be at most one value $b_j$ 
such that $\mathrm{od}(a_i, b_j) \ll D$; if there were two such values $b_{j_1}$ and $b_{j_2}$, then by the om-triangle 
inequality, $\mathrm{od}(b_{j_1}, b_{j_2}) \ll D$. Hence, all but k different values of $b_j$ are at least D from any 
of the $a_i$. Let u be any of these values of $b_j$. Then since $\mathrm{od}(u, a_1) = D$ and $\mathrm{od}(a_1, a_i) \le D$ 
for all i, it follows that $\mathrm{od}(u, a_i) \le D$ for all $a_i$. Thus, since $\mathrm{od}(u, a_i) \le D$ but not 
$\mathrm{od}(u, a_i) \ll D$, it follows that $\mathrm{od}(u, a_i) = D$. □ 

Lemma 26: Let T be a cluster tree; let $\Gamma$ be a valuation over om-space $\Omega$ satisfying T; 
and let T' be an extension of T by x. If $\Omega$ is dense and unbounded above, then there is a 
value a such that the valuation $\Gamma \cup \{x \to a\}$ satisfies T'. 

Proof: For operations (1) and (2) the statement is trivial. 

Otherwise, let L be an extending label of T. If L = 0, then set D = 0. If L is in T, 
then let D be the order of magnitude corresponding to L in T under $\Gamma$. If $L_1 < L < L_2$ 
where $L_1$ and $L_2$ are labels of consecutive values in T, then let $D_1$ and $D_2$ be the orders of 
magnitude corresponding to $L_1, L_2$ in T under $\Gamma$. Let D be chosen so that $D_1 \ll D \ll D_2$. 
If L is greater than any label in the tree, then choose D to be greater than the diameter of 
the tree under $\Gamma$. 

If T' is formed from T by operation (3), then using lemma 25 let a be a point such that 
$\mathrm{od}(a, \Gamma(y)) = \mathrm{odiam}(N)$ for all y in N.symbols. 

If T' is formed from T by operation (4), then let a be a point such that $\mathrm{od}(a, \Gamma(y)) = D$. 

If T' is formed from T by operation (5), then let a be a point such that $\mathrm{od}(a, \Gamma(y)) = D$ 
for all y in C.symbols. (Note that, since M.label < N.label, $D \ll \mathrm{odiam}(N.\mathrm{symbols})$.) 

If T' is formed from T by operation (6), then let a be a point such that $\mathrm{od}(a, \Gamma(y)) = D$ 
for all y in R.symbols. 

In each of these cases, it is straightforward to verify that $\Gamma \cup \{x \to a\}$ satisfies T'. □ 

As we observed in Section 8 regarding lemma 28, the conditions on $\Omega$ in lemma 26 
are necessary, and the statement is false otherwise. For example, let $\Omega$ be the om-space 
described in example I, Section 3, of polynomials over an infinitesimal $\delta$. Then $\Omega$ is not 
unbounded above; there is a maximum order-of-magnitude O(1). Let T be the starting tree 
of Figure 3 (upper-left corner), and let T' be the result of applying operation 6 (middle 
bottom). Let $\Gamma$ be the valuation $\{u \to \delta, v \to 2\delta, w \to 1\}$. Then $\Gamma$ satisfies T, but it cannot 
be extended to a valuation that satisfies T', as that would require x to be given a value 
such that $\mathrm{od}(v, w) \ll \mathrm{od}(x, w)$, and no such value exists within $\Omega$. The point of the lemma 
is that, if $\Omega$ is required to be both dense and unbounded above, then we cannot get "stuck" 
in this way. 

Lemma 27: Let T be a cluster tree. Let X be a variable not among the symbols of T. 
Let $\alpha$ be an open formula in $\mathcal{L}$, whose free variables are the symbols of T and the variable 
X. Let $\phi$ be the formula $\exists X\, \alpha$. Let $\Omega$ be an om-space that is dense and unbounded above. 
Then there exists an instantiation $\Gamma$ of T in $\Omega$ that satisfies $\phi$ if and only if there exists an 
extension T' of T and an instantiation $\Gamma'$ of T' that extends $\Gamma$ and satisfies $\alpha$. 

Proof: Suppose that there exists an instantiation $\Gamma$ of T that satisfies $\exists X\, \alpha$. Then, by 
definition, there is a point a in $\Omega$ such that $\Gamma$ satisfies $\alpha(X/a)$. That is, the instantiation 
$\Gamma \cup \{X \to a\}$ satisfies $\alpha$. Let $\Gamma' = \Gamma \cup \{X \to a\}$. By lemma 24, the cluster tree T' 
corresponding to $\Gamma'$ is an extension of T. 

Conversely, suppose that there exists an extension T' of T and an instantiation $\Gamma'$ of T' 
satisfying $\alpha$. Let $\Gamma$ be the restriction of $\Gamma'$ to the symbols of T. Then clearly $\Gamma$ satisfies the 
formula $\exists X\, \alpha$. □ 

Lemma 28: Let T be a cluster tree. Let $\phi$ be an open formula in $\mathcal{L}$, whose free variables 
are the symbols of T. Let $\Omega$ be an om-space that is dense and unbounded above. If one 
instantiation $\Gamma$ of T in $\Omega$ satisfies $\phi$ then every instantiation of T in $\Omega$ satisfies $\phi$. 

Proof: We can assume without loss of generality that the only logical symbols in $\phi$ are 
$\neg$ (not), $\wedge$ (and), $\exists$ (exists), = (equals) and variable names, and that the only non-logical 
symbol is the predicate "much_closer". We now proceed using structural induction on the 
form of $\phi$. Note that an equivalent statement of the inductive hypothesis is, "For any formula 
$\psi$, either $\psi$ is true under every instantiation of T, or $\psi$ is false under every instantiation of 
T." 

Base case: If $\phi$ is an atomic formula "X = Y" or "much_closer(W, X, Y, Z)" then this 
follows immediately from corollary 4. 

Let $\phi$ have the form $\neg\psi$. If $\phi$ is true under $\Gamma$, then $\psi$ is false under $\Gamma$. By the 
inductive hypothesis, $\psi$ is false under every instantiation of T. Hence $\phi$ is true under every 
instantiation of T. 

Let $\phi$ have the form $\psi \wedge \theta$. If $\phi$ is true under $\Gamma$ then both $\psi$ and $\theta$ are true under $\Gamma$. By 
the inductive hypothesis, both $\psi$ and $\theta$ are true under every instantiation of T. Hence $\phi$ is 
true under every instantiation of T. 

Let $\phi$ have the form $\exists X\, \alpha$. If $\phi$ is true under $\Gamma$ then by lemma 27, there exists an 
extension T' of T and an instantiation $\Gamma'$ of T' such that $\alpha$ is true under $\Gamma'$. By the inductive 
hypothesis, $\alpha$ is true under every instantiation of T'. Now, if $\Delta'$ is an instantiation of T' 
that satisfies $\alpha$, and $\Delta$ is the restriction of $\Delta'$ to the variables in T, then clearly $\Delta$ satisfies 
$\exists X\, \alpha$. But by lemma 26, every instantiation $\Delta$ of T can be extended to an instantiation $\Delta'$ 
of T'. Therefore, every instantiation of T satisfies $\phi$. □ 

Theorem 4: Let T be a cluster tree. Let $\phi$ be an open formula in $\mathcal{L}$, whose free variables 
are the symbols of T. Let $\Omega$ be an om-space that is dense and unbounded above. Algorithm 
decide(T, $\phi$) returns true if T satisfies $\phi$ and false otherwise. 

Proof: Immediate from the proof of lemma 28. □ 